From: Levo D <l-asm@mail9fcb1a.bolinlang.com>
To: <linux-api@vger.kernel.org>
Subject: mmap a file without an overwrite risk?
Date: Tue, 17 Oct 2023 00:56:00 +0000
Message-ID: <20231017005600.290E4FA139@bolin>

[-- Attachment #1: Type: text/plain, Size: 826 bytes --]

Attached are two C files; main.c contains the reproduction instructions.

I noticed that reading a large file is a bit slow, and that allocating a large amount of memory with malloc/mmap can stall for hundreds of milliseconds waiting on the OS. Mapping the file with mmap, by contrast, is incredibly fast, but it leaves my software open to memory corruption. I attached a reproducible example.

I tried various MAP_* flags and couldn't find a way to get rid of the risk. The gist of the problem is that if I mmap a file, another process can overwrite data, which then shows up in my mapping, or truncate the file (as the O_TRUNC open in app2.c does), which causes a bus error in my code. Not shown in the code: I also tried writing to every page in the hope that it would keep my memory from being overwritten, but that didn't work either.

If there's nothing I can do, is there an alternative way to load a file that is quicker than malloc+read? Files can be over 100 MB, or several GB, in size.


[-- Attachment #2: main.c --]
[-- Type: application/octet-stream, Size: 1456 bytes --]

/* Reproducible steps
gcc main.c -o app1
gcc app2.c -o app2
fn=/tmp/uB2kCDFlDoRA56gdeuvOVKbL1KHXxMhDXc3pFnhjbhc
dd if=/dev/zero of=$fn count=5120 bs=4096; ./app1 $fn 1 | ./app2 $fn
With mmap you'll see "Result was 72" instead of "Result was 0"
dd if=/dev/zero of=$fn count=5120 bs=4096; ./app1 $fn 0 | ./app2 $fn
With malloc+read you'll see the desired "Result was 0"
In app2.c, comment out line 12 and uncomment line 13 (the O_TRUNC open)
gcc app2.c -o app2
Now the mmap run dies with a bus error, and the last line written is "Waking"
dd if=/dev/zero of=$fn count=5120 bs=4096; ./app1 $fn 1 | ./app2 $fn
The malloc+read run is unaffected
dd if=/dev/zero of=$fn count=5120 bs=4096; ./app1 $fn 0 | ./app2 $fn
*/

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/stat.h>
int main(int argc, char *argv[])
{
	//a.out filename 1
	if (argc != 3) {
		fprintf(stderr, "Bad args\n");
		return 1;
	}
	
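	int fd = open(argv[1], O_RDONLY);
	if (fd < 0) {
		perror("open");
		return 1;
	}
	struct stat s = {0};
	if (fstat(fd, &s) != 0) {
		perror("fstat");
		return 1;
	}
	char *p;
	if (argv[2][0] == '1') {
		// Map the file directly instead of copying it
		p = (char*)mmap(0, s.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
		if (p == MAP_FAILED) {
			perror("mmap");
			return 1;
		}
	} else {
		// Copy the file into private heap memory
		p = (char*)malloc(s.st_size);
		if (!p || read(fd, p, s.st_size) != s.st_size) {
			fprintf(stderr, "malloc/read failed\n");
			return 1;
		}
	}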
	write(2, "Sleeping\n", 9);
	write(1, "Sleeping\n", 9); //Triggers the other app to write
	sleep(1);
	write(2, "Waking\n", 7);

	//read first byte of every 4k page
	int sum = 0;
	for (long i=0; i<s.st_size; i+=4096)
		sum += p[i];
	fprintf(stderr, "Result was %d\n", sum);
	return 0;
}

[-- Attachment #3: app2.c --]
[-- Type: application/octet-stream, Size: 401 bytes --]

#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>
int main(int argc, char *argv[])
{
	if (argc != 2) {
		fprintf(stderr, "Bad args\n");
		return 1;
	}
	char buf[4096];
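	read(0, buf, 4096); // Block until app1 announces on the pipe that it is sleeping
	int fd = open(argv[1], O_WRONLY|O_CREAT, 0644); // O_CREAT requires a mode argument
	//int fd = open(argv[1], O_WRONLY|O_CREAT|O_TRUNC|O_CLOEXEC, 0644); //<-- This one causes a bus error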
	write(fd, "Hello", 5);
	fprintf(stderr, "Written %d\n", fd);
}
