How to fix being stuck on Indexing forever loop of death on multi million file DBs after db crash

XionicFire
Collaborator | Level 9

Ok so I've used Dropbox for almost as long as it's existed, and recently, out of frustration with the never-finishing indexing bug, I was forced to find out why it kept happening so I could prevent it.

 

Bear with me on this long post, but trust me, it's worth it: what I found was mind-blowing and game-changing.

 

So our business Dropbox is more than 9 million files strong. I've noticed that, REALISTICALLY, any machine handling over 2 million files will at some point enter an indexing loop from which it never recovers. After it happened 5 times in the last week, I was pissed enough that I decided to find out why. I know I'm pushing the limits, but we've had machines with 2.5m files running fine for years. Why some work fine and some don't was a mystery, and one I was determined to solve.

 

When you add stuff to your Dropbox, Dropbox has to index it so it knows what to do with it.

If you add "too much stuff" (copying 200,000 small files in one go; coders know what I'm talking about)

or

do it "too fast" (changing access permissions on 1.5 million files inside Dropbox in one go, in less than 5 minutes)

on a computer with too many files (1-2+ million),

Dropbox starts indexing them all at once, which slows the system to a crawl.

And if you don't let Dropbox finish before doing something else (like adding more files or using the computer for other tasks)

or

someone else adds a bunch of files on another machine,

it's almost sure to make Dropbox crash and restart during this process (it happens quite frequently).

This crash and restart triggers a full reindexing of the ENTIRE database, ALL files. Since the machine is already trying to download or upload some of the new files while reindexing the existing millions of files, doing both at once overtaxes it, which causes another crash, and we're back to square one: the infinite indexing crash loop.
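To check whether a machine is in the risk range described above, it helps to know how many files actually live under its Dropbox folder. Here is a minimal sketch in Python. The folder location (`~/Dropbox`) and the 2 million cutoff are assumptions taken from my own experience in this post, not anything Dropbox documents, and `count_files` and `is_risky` are just illustrative names.

```python
import os
from pathlib import Path

# Rough cutoff from this post (about 2 million files); an observation, not an official limit.
RISKY_FILE_COUNT = 2_000_000


def count_files(root):
    """Count non-directory entries under root (symlinked folders are not followed)."""
    total = 0
    for _dirpath, _dirnames, filenames in os.walk(root):
        total += len(filenames)
    return total


def is_risky(file_count, threshold=RISKY_FILE_COUNT):
    """Return True when the file count reaches the assumed risk threshold."""
    return file_count >= threshold


if __name__ == "__main__":
    dropbox_root = Path.home() / "Dropbox"  # assumed default location, adjust as needed
    if dropbox_root.is_dir():
        n = count_files(dropbox_root)
        status = "at risk of the indexing loop" if is_risky(n) else "below the risky range"
        print(f"{n} files under {dropbox_root}: {status}")
    else:
        print(f"No Dropbox folder found at {dropbox_root}")
```

Walking millions of files takes a while and hits the disk, so run it when Dropbox is idle rather than in the middle of an indexing pass.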

 

This kept happening to us all the time. The only solution was to unlink and relink the Dropbox account, which meant all pending changes were lost, we got a bunch of conflicted copies, and we spent days sorting out the mess.

 

So I figured I needed to see what was going on with Dropbox, what it was doing when it was "indexing" to find out what was causing the crashes.

 

So after looking for a while, I found that Microsoft's Sysinternals Process Monitor (https://docs.microsoft.com/en-us/sysinternals/downloads/procmon), configured in a certain way, let me see EXACTLY what Dropbox was doing. That's how I found out everything I just told you, plus another very important piece of info.

(if you want to see how awesome it looks in action check out THIS video:)

 

When you click PAUSE FILE SYNCING you would think Dropbox pauses and ceases all operations, but you'd be wrong! It pauses all incoming and outgoing transfers, but any INDEXING tasks keep going!

 

This is an absolute game changer! Now if I see a machine that says "Indexing" for a long time, I turn on Process Monitor, hit pause on the file syncing and watch the machine do the indexing at super high speed (5-10 times faster than while downloading). It usually finishes the full reindexing in a couple of minutes, and once it's done I can hit resume and keep going. I've never had the app crash during this "offline" or "paused" indexing, so it avoids the inevitable crash and reindex loop.

 

I have been successful in recovering 4 machines from the indexing loop of death using this method, where before I was screwed and had to eat the duplicate files and cleanup for a week and a TON of annoyed users in the office.

 

Basically, if your machine is taking too long indexing or is stuck indexing after a crash, just hit "pause for 1 hour" and forget about it. It will keep working on the indexing in the background, and when syncing resumes it should have finished the re-index, avoiding a crash when it tries to download/upload the new files.

 

I wish Dropbox had told us this. I never expected it to keep indexing while paused; I assumed pause was PAUSE, as in, cease all operations. Knowing this would have saved me so many headaches.

 

All they need to do now is give us a "log viewer" or something so we can tell when it's done doing its thing and we can hit resume. It should also show us, even when paused, whether it's indexing or not, so we know when it's safe to restart. Or better yet, make it so that if Dropbox has to index a large volume of files (say over 100), it automatically pauses all other disk operations until the indexing is complete, then restarts the downloads. Trying to do both does not work. I know you want it to, but it just doesn't, and it causes the whole thing to explode non-stop in a loop of death. Maybe enable this through a setting somewhere? Or auto-enable it on machines with over 500k files? Something has to be possible.

 

TLDR:

If your Dropbox is stuck indexing, hit pause for 30 mins and let it do its thing until it's done. It keeps indexing even when paused. You won't know whether it's doing anything unless you use procmon, but it's working. Try to avoid using the hard drive or the machine until it's done (usually less than 30 mins), and your indexing/crashing problem will be fixed.

 

Message to Dropbox:

Dear Dropbox, please give us a way to view this info without having to resort to third-party programs. That way we can troubleshoot our own Dropbox issues and take a lot of load off your customer service guys.

Something like a setting somewhere saying "activate/enable troubleshoot/server mode" that turns on an always-shown (ALWAYS, NOT ONLY ON MOUSE OVER, BUT ALWAYS!!!) little 3-tab window containing:

 

Indexing files (with a current list of the exact files being indexed and their speed (x files per sec)/paths)

Downloading files (with a current list of the exact files being downloaded and their speed/paths)

Uploading files (with a current list of the exact files being uploaded and their speed/paths)

 

There's another issue with slow uploads due to Dropbox connections stuck in a "stagnant state" (force closing the TCP socket connection using netmon restarts the download/upload and speed goes back up again) but that's another problem for another time.

 

I hope this was helpful to some other sysadmin and sorry for the long message but it needed explaining.

12 Replies

ArcticAnna
Explorer | Level 3

Thank you! 

I KEEP on having this problem. 

Apple wiped my Mac during a fix, so I started adding folders one by one (4.3TB of mostly small files).

As you can imagine, this took me weeks

And then it just stalled.

I kept thinking that, since it was just 3.8GB trying to change, it would sort itself out, and as I waited and watched, of course I kept fixing things as I went.

But no go.

DB support: zero support, other than telling me to unlink, which I clearly didn't want to do since it would take me back to ground zero.

But that is now where I find myself. I didn't read this in time, unlinked, relinked, tried to link just a few folders to start off with... and we are just indexing. Oh boy. I am also thinking about looking elsewhere, but I hate Google, and a few years ago, at least, Microsoft had issues syncing between Mac and PC...

XionicFire
Collaborator | Level 9

Well, it happened again... out of nowhere, with no major changes: syncing forever for no apparent reason, just never finishing. I tried my usual tricks and nothing worked.

 

Using resmon to see if any files were held open by some app that was holding up the sync?

Nope

 

Using resmon to check for disk activity to see what files were being written or read that could be causing the problem?

Nope

 

Checking the folders to see which one had the blue "sync" icon, in case one huge file was still syncing?

Nope

 

Suspending sync and waiting for the re-index to finish, to find out if that fixes it?

Nope

 

I was running out of ideas and I came across this post:

https://www.dropboxforum.com/t5/Apps-and-Installations/Desktop-client-stuck-a-quot-Syncing-quot/td-p...

 

And I figured, could it be that easy?

So I went to the Dropbox\.dropbox.cache\old_files\ folder and, sure enough, a 15 gig file was there. I deleted it and, like magic, the problem that had plagued me for more than a week fixed itself.

 

So if you tried my previous solutions and the sync issue still persists, try this one.
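If you'd rather check before digging through folders by hand, here is a minimal sketch that lists large leftovers in the cache folder. It assumes the default layout (`Dropbox/.dropbox.cache/old_files` under your home folder); `list_cached_files` and the 1 GiB cutoff are made up for illustration. It only lists files and deletes nothing, so the decision to remove anything stays with you.

```python
from pathlib import Path


def list_cached_files(old_files_dir, min_bytes=1024**3):
    """Return (path, size) pairs for files of at least min_bytes, largest first."""
    found = []
    for entry in Path(old_files_dir).iterdir():
        if entry.is_file():
            size = entry.stat().st_size
            if size >= min_bytes:
                found.append((entry, size))
    return sorted(found, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Assumed default location; adjust if your Dropbox folder lives elsewhere.
    old_files = Path.home() / "Dropbox" / ".dropbox.cache" / "old_files"
    if old_files.is_dir():
        for path, size in list_cached_files(old_files):
            print(f"{size / 1024**3:6.1f} GiB  {path}")
    else:
        print(f"No such folder: {old_files}")
```

Lower `min_bytes` if nothing shows up; in my case the culprit was a single 15 gig file.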

 

For those that care:

 

The reason this happens is that every time you right-click a file or folder and convert it to "online-only", Dropbox doesn't just delete it right away; it moves it to a temporary location for further processing.

 

Basically, when you do this, Dropbox immediately moves the file to Dropbox\.dropbox.cache\old_files\ under a random generic number and puts an online-only link where the file was as a placeholder. Then it starts hashing the file: it reads the whole thing, creates a hash of it, and compares that to the hash on the server. If the hashes match, it deletes the file for real; if they don't, it uploads the file and then deletes it. This ensures the online-only'ed file is always the most up-to-date version.
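To make that flow concrete, here is a toy model of it in Python. This is only a sketch of the behaviour as I observed it from the outside, not Dropbox's actual code: the function names, the `upload` callback and the use of plain SHA-256 are all assumptions for illustration.

```python
import hashlib
from pathlib import Path


def file_hash(path, chunk_size=1024 * 1024):
    """Read the whole file in chunks and return its SHA-256 hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def finish_online_only(cached_path, server_hash, upload):
    """Toy model of the cleanup step for a file parked in old_files.

    `upload` is a hypothetical callback standing in for the real upload.
    """
    cached_path = Path(cached_path)
    if not cached_path.exists():
        # File already gone: nothing to hash, the server copy is taken as current.
        return "skipped"
    if file_hash(cached_path) == server_hash:
        cached_path.unlink()  # hashes match, safe to delete for real
        return "deleted"
    upload(cached_path)       # local copy differs, push it before deleting
    cached_path.unlink()
    return "uploaded"
```

In this model a file that can never be hashed never reaches either `unlink()` call, so it just sits in the cache folder, while a file that is already missing falls straight through the "skipped" branch.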

 

This has two benefits:

 

First, as described above, if the file turns out to be different, Dropbox corrects the situation by uploading it.

 

Second, if you accidentally hit online-only on these files, you have until Dropbox finishes this process to "undo" your mistake without having to redownload the files, so it's kinda useful.

 

I'm guessing something borked in the Dropbox hashing routine, so it was never able to hash the file. Since the file was never hashed it couldn't be deleted, and since it was not deleted the sync icon just stayed there forever.

 

Deleting the file causes Dropbox to error out with "file not found" and skip the hashing procedure. It just assumes the last version it has on the server is the newest one and leaves it at that.

 

This causes the sync to finally finish

 

So now you know, if it happens to you, try this, it might fix it. 

Graham
Community Manager

Hi @XionicFire,

 

First of all, I just wanted to say that I can really feel your frustration with this issue happening to you again. Hopefully this will be the last time!

 

Thank you so much for letting us know this fix worked for you. I can see you've gone to great effort with troubleshooting, and I'm delighted you eventually landed on a Community thread that was able to solve your problem. Hopefully, if anyone else is affected in a similar way, this will help them too.

 

Thanks again, and have a great day!

Graham
