From: Marcel De Boer <marcel.de_boer@nokia.com>
To: autofs@vger.kernel.org
Subject: Near-simultaneous automount of multiple directories fails
Date: Fri, 8 Apr 2016 09:55:13 +0200
Message-ID: <alpine.LRH.2.11.1604080846130.26210@carrot.ant.ipd.priv>

Hi!

I've already reported this on the CentOS bug tracker a while ago, but I 
thought I'd report it here too.

https://bugs.centos.org/view.php?id=9835

In summary (there's more information in the bug report): on one of our
servers we initially saw one home directory become inaccessible every few
days. This happened to two different home directories (but only one at a
time) out of the couple of hundred we have. We traced this to cron scripts
that were scheduled at the same time and ran out of the affected home
directories, which caused both directories to be mounted nearly
simultaneously.

A test setup on a different machine (the primary description from the bug 
report, as the server was not stock CentOS) also showed that if we had 
cron simultaneously mount four directories every 10 minutes, only half of 
them would get mounted every time. On this machine an RPM rebuild of 
autofs made the issue disappear, but it was much more persistent on the 
server.
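
For anyone who wants to try this without cron, the simultaneous access
could be approximated with a small program like the one below. This is
only a sketch: it assumes that a stat() of each path is enough to trigger
the automount, and trigger_together(), trigger_one() and MAX_PATHS are
names made up for this example, not anything from autofs.

```c
#include <assert.h>
#include <pthread.h>
#include <sys/stat.h>

#define MAX_PATHS 16	/* arbitrary limit for this sketch */

struct trigger {
	const char *path;
	int ok;
};

/* Start gate: workers sleep here until all of them have been created. */
static pthread_mutex_t go_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t go_cond = PTHREAD_COND_INITIALIZER;
static int go;

static void *trigger_one(void *arg)
{
	struct trigger *t = arg;
	struct stat st;

	pthread_mutex_lock(&go_lock);
	while (!go)
		pthread_cond_wait(&go_cond, &go_lock);
	pthread_mutex_unlock(&go_lock);

	/* Assumption: touching the path makes the automounter mount it. */
	t->ok = (stat(t->path, &st) == 0);
	return NULL;
}

/* Access all paths at (nearly) the same moment; return how many worked. */
static int trigger_together(const char **paths, int count)
{
	struct trigger t[MAX_PATHS];
	pthread_t tid[MAX_PATHS];
	int started = 0, ok = 0, i;

	if (count < 1 || count > MAX_PATHS)
		return -1;

	pthread_mutex_lock(&go_lock);
	go = 0;
	pthread_mutex_unlock(&go_lock);

	for (i = 0; i < count; i++) {
		t[i].path = paths[i];
		t[i].ok = 0;
		if (pthread_create(&tid[i], NULL, trigger_one, &t[i]))
			break;
		started++;
	}

	/* Release every waiting thread at once. */
	pthread_mutex_lock(&go_lock);
	go = 1;
	pthread_cond_broadcast(&go_cond);
	pthread_mutex_unlock(&go_lock);

	for (i = 0; i < started; i++) {
		pthread_join(tid[i], NULL);
		ok += t[i].ok;
	}
	return ok;
}
```

Calling trigger_together() with four paths under the automount point and
comparing the return value with 4 would show whether every mount came up.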

In the end, the issue seems to be in mount_mount() in mount_nfs.c; to my
untrained eye, it looks like it can be called simultaneously from
different threads, which then change shared data, probably the 'hosts' or
'tmp' lists.
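
To illustrate the kind of race I suspect, here is a standalone sketch. It
is not autofs code: shared_list, prepend_entry() and run_workers() are
hypothetical names used only to show several threads updating one list,
with a mutex keeping the updates from interleaving.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define MAX_THREADS 16
#define ENTRIES_PER_THREAD 10000

/* Hypothetical stand-in for a list shared between mount threads. */
struct entry {
	struct entry *next;
	int id;
};

static struct entry *shared_list;
static pthread_mutex_t shared_list_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Prepend one entry; without the lock, concurrent callers can lose updates. */
static int prepend_entry(int id)
{
	struct entry *e = malloc(sizeof(*e));

	if (!e)
		return -1;
	e->id = id;

	pthread_mutex_lock(&shared_list_mutex);
	e->next = shared_list;
	shared_list = e;
	pthread_mutex_unlock(&shared_list_mutex);
	return 0;
}

static void *worker(void *arg)
{
	(void) arg;
	for (int i = 0; i < ENTRIES_PER_THREAD; i++)
		if (prepend_entry(i))
			break;
	return NULL;
}

/* Start nthreads workers, wait for them, then count and free the list. */
static long run_workers(int nthreads)
{
	pthread_t tid[MAX_THREADS];
	long count = 0;
	int started = 0;

	assert(nthreads > 0 && nthreads <= MAX_THREADS);
	while (started < nthreads &&
	       pthread_create(&tid[started], NULL, worker, NULL) == 0)
		started++;
	for (int i = 0; i < started; i++)
		pthread_join(tid[i], NULL);

	/* All workers are done, so the list can be walked without the lock. */
	while (shared_list) {
		struct entry *e = shared_list;

		shared_list = e->next;
		free(e);
		count++;
	}
	return count;
}
```

With the lock in place, run_workers(4) should count 4 * ENTRIES_PER_THREAD
entries, provided every thread starts and no allocation fails. Removing
the lock/unlock pair in prepend_entry() makes lost entries possible.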

I made a patch that seems to work reliably for our situation, but it's
very crude: it just makes sure that everything touching the 'hosts' list
(and everything else that runs during that time) does not run in
parallel. It might be a starting point for someone who knows the code
better, though. (The patch was made against the code used in the
5.0.5_115 CentOS 6 RPM.)
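
As a side note on the shape of such a coarse lock: the patch below adds an
unlock before every return, which is easy to get wrong when a new return
is added later. A common alternative is a single exit label. The sketch
below only shows that pattern; it is not how mount_nfs.c is structured,
and big_lock and do_serialised() are invented names.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical lock that serialises the whole function body. */
static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Copy `what` while holding the lock and report its length.
 * Returns 0 on success and 1 on failure; every path leaves through
 * the `out` label, so the unlock appears exactly once.
 */
static int do_serialised(const char *what, size_t *len_out)
{
	char *copy = NULL;
	size_t len;
	int ret = 1;

	pthread_mutex_lock(&big_lock);

	if (!what || !*what)
		goto out;

	len = strlen(what);
	copy = malloc(len + 1);
	if (!copy)
		goto out;
	memcpy(copy, what, len + 1);

	*len_out = len;
	ret = 0;
out:
	free(copy);	/* free(NULL) is a no-op */
	pthread_mutex_unlock(&big_lock);
	return ret;
}
```

Both the success and the failure paths release the lock, so a later call
never blocks on a lock that an earlier early return left held.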

The server has received some more upgrades in the meantime, so we may no
longer be able to reproduce it on that system.

Kind regards,
 	Marcel de Boer


--- autofs-5.0.5-orig/modules/mount_nfs.c	2016-01-05 15:26:55.993014650 +0100
+++ autofs-5.0.5/modules/mount_nfs.c	2016-01-05 15:25:51.434011526 +0100
@@ -40,6 +40,9 @@
  static struct mount_mod *mount_bind = NULL;
  static int init_ctr = 0;

+/* Multiple access to hosts workaround */
+static pthread_mutex_t host_list_mutex = PTHREAD_MUTEX_INITIALIZER;
+
  int mount_init(void **context)
  {
  	/* Make sure we have the local mount method available */
@@ -190,7 +193,9 @@
  		      nfsoptions, nobind, nosymlink, ro);
  	}

+	pthread_mutex_lock(&host_list_mutex);
  	if (!parse_location(ap->logopt, &hosts, what, flags)) {
+		pthread_mutex_unlock(&host_list_mutex);
  		info(ap->logopt, MODPREFIX "no hosts available");
  		return 1;
  	}
@@ -235,6 +240,7 @@

  dont_probe:
  	if (!hosts) {
+		pthread_mutex_unlock(&host_list_mutex);
  		info(ap->logopt, MODPREFIX "no hosts available");
  		return 1;
  	}
@@ -264,6 +270,7 @@
  		char *estr = strerror_r(errno, buf, MAX_ERR_BUF);
  		error(ap->logopt,
  		      MODPREFIX "mkdir_path %s failed: %s", fullpath, estr);
+		pthread_mutex_unlock(&host_list_mutex);
  		return 1;
  	}

@@ -300,6 +307,7 @@
  			/* Success - we're done */
  			if (!err) {
  				free_host_list(&hosts);
+				pthread_mutex_unlock(&host_list_mutex);
  				return 0;
  			}

@@ -325,6 +333,7 @@
  			if (!loc) {
  				char *estr = strerror_r(errno, buf, MAX_ERR_BUF);
  				error(ap->logopt, "malloc: %s", estr);
+				pthread_mutex_unlock(&host_list_mutex);
  				return 1;
  			}
  			if (this->addr->sa_family == AF_INET6) {
@@ -338,6 +347,7 @@
  			if (!loc) {
  				char *estr = strerror_r(errno, buf, MAX_ERR_BUF);
  				error(ap->logopt, "malloc: %s", estr);
+				pthread_mutex_unlock(&host_list_mutex);
  				return 1;
  			}
  			strcpy(loc, this->name);
@@ -365,6 +375,7 @@
  			info(ap->logopt, MODPREFIX "mounted %s on %s", loc, fullpath);
  			free(loc);
  			free_host_list(&hosts);
+			pthread_mutex_unlock(&host_list_mutex);
  			return 0;
  		}

@@ -374,6 +385,7 @@

  forced_fail:
  	free_host_list(&hosts);
+	pthread_mutex_unlock(&host_list_mutex);

  	/* If we get here we've failed to complete the mount */



-- 
Marcel de Boer
Test engineer, Service Routing R&D, IP/Optical Networks
Nokia, Antwerp, Belgium