Dropbox .sparsebundle sync broken...?

Hey Michael,

  1. Hope you had a wonderful Christmas.

  2. I’m just spending my boxing day evening being converted to EF, which is rapidly winning me over having spent much of the last year pinging between different solutions, however…
    I seem to be experiencing this problem:

http://forums.getdropbox.com/topic.php?id=5363&replies=5#post-36400

having followed the instructions in the EF manual regarding using an encrypted img as the library.

I was wondering if you had any experience of this, which I realise is a Dropbox rather than an EF issue, and whether you could suggest any solutions. Failing that, if Dropbox support does seem to be broken, then it might be a useful amendment to the manual. I’m a little nervous about downloading developer tools and running shell commands… :slight_smile:

Anyways - have a great holiday…

cheers!

-i

Dropbox .sparsebundle sync broken…? Of course not…
blushes

Sorry… So that was entirely my mistake… .sparseimage, which is of course what you tell us to use in the manual, works splendidly…

I hang my head in shame :slight_smile:

That’s what drinking over Xmas does for you…

thanks again!

-i

Dropbox has a bug where it damages sparse bundle disk image (.sparsebundle) files. The EagleFiler manual recommends creating an encrypted library in “sparse disk image” (.sparseimage) format instead if you want to sync it with Dropbox.

Is the damage simply a matter of losing the bundle bit? I’ve been using a sparse bundle for quite some time now, and as far as I can tell there is nothing corrupt with the image meta files or its storage files. Sure, I did have to manually set the bundle bit in Path Finder on the second computer, but once that was done it has appeared to run smoothly.

Based on this conversation, I’ve switched over to a sparse image for now, but would much prefer to stick with the sparse bundle format if possible. It’s a friendly format in terms of synchronisation and backup. One small change in a single text file doesn’t result in a bandwidth-hogging 200 MB upload (or a hundred+ copies in Time Machine of that size).

One thing that is a little freaky, I opened up one of the band files in a hex editor and changed a few bytes at random (in a duplicate image of course!). The thing mounted just fine. Presumably some file amongst the thousands is now damaged, but who knows where. I wouldn’t have thought that an encrypted stream could be damaged and still result in a partial data set.

I think it’s something like that, but it’s hard to know since Dropbox doesn’t document what their software does. The weird thing is that the bundle bit isn’t necessary for most other types of packages.

For backups, I agree. But isn’t Dropbox supposed to be smart about only uploading the changed parts of the file?

Another reason to use EagleFiler’s end-to-end file verification. :slight_smile:

Right, most package formats these days are handled real-time by the Finder and a central database. This can be tested by creating an RTF called TXT.rtf and then dropping it into a folder named test.rtfd. Poof, you get a valid RTFD file.

But it seems that sparse bundles still rely on the resource fork from what I can tell (they are in fact created with it set). Doing the above manual process and copying the internal components of an existing sparse bundle into a directory test.sparsebundle will simply result in a folder. You have to set the bundle bit in the resource fork before it will work—then it does so flawlessly. I would imagine the above description is precisely what happens when DropBox assembles a new sparse bundle on the second computer.
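For anyone who would rather script that step than click through Path Finder, here is a minimal sketch. It assumes Apple's `SetFile` tool (installed with the developer tools) is on the PATH; the function names `setfile_command` and `set_bundle_bit` are made up for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def setfile_command(bundle_path):
    """Build the SetFile invocation that turns on the bundle attribute ("B")."""
    return ["SetFile", "-a", "B", str(bundle_path)]


def set_bundle_bit(bundle_path):
    """Run SetFile on a folder; return False if the tool or the folder is missing."""
    path = Path(bundle_path)
    if shutil.which("SetFile") is None or not path.is_dir():
        return False
    subprocess.run(setfile_command(path), check=True)
    return True
```

Try it on a duplicate of the image first, e.g. `set_bundle_bit("test.sparsebundle")`, rather than on the only copy of a library.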

They have announced that they will be addressing resource forks in the future, so hopefully this will not be a permanent issue. I think, for the time being, your advice in the manual is best. Not everyone knows how to mess with resource forks, and there is a slight risk that DB is doing something potentially damaging. I don’t know how integrated the band files of a sparse bundle package are. I guess my byte test would suggest they are merely sequentially loaded 8 MB chunks; if they were striped at all, damaging one would cause major errors.
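If that guess is right, finding the band that holds a given byte of the image is simple arithmetic. The sketch below is only an illustration of the guess: the 8 MB band size and the idea that bands are named by their index in hexadecimal are assumptions, not something confirmed in this thread, and `band_for_offset` is a hypothetical helper.

```python
BAND_SIZE = 8 * 1024 * 1024  # assumed band size (the 8 MB chunks guessed above)


def band_for_offset(offset, band_size=BAND_SIZE):
    """Return (band name, offset inside that band) for a byte offset in the image."""
    if offset < 0:
        raise ValueError("offset must be non-negative")
    index, within = divmod(offset, band_size)
    # Assumption: band files are named by their index written in lowercase hex.
    return format(index, "x"), within
```

Under those assumptions, damage inside one band file would only touch the 8 MB range that band covers, which would fit with the image still mounting after the hex-editor test.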

But isn’t Dropbox supposed to be smart about only uploading the changed parts of the file?

Unfortunately, no. It is smart about what “changed” means. It doesn’t simply check modification times, rather it uses a database of hashes to determine file equality. But when it comes to byte level granularity that doesn’t seem to exist. I just tested this by changing one text file in my EF library, closing it, and then unmounting the image. It is now transferring the entire image file.
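To make the "database of hashes" idea concrete, here is a rough sketch of content-based change detection. Dropbox doesn't document its method, so the choice of SHA-256 and the names `file_digest` and `changed_files` are assumptions for illustration only.

```python
import hashlib


def file_digest(path, chunk_size=1024 * 1024):
    """SHA-256 of a file's contents, read in chunks so large images need little memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def changed_files(paths, known_digests):
    """Return the paths whose current digest differs from the recorded one."""
    return [p for p in paths if known_digests.get(p) != file_digest(p)]
```

A check like this ignores modification times entirely: a file counts as changed only if its bytes differ. It also works on whole files, which would be consistent with the entire image being re-sent after one small edit.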

I think the other drawback to images (as opposed to bundles) is that OS X doesn’t update the disk as often (or perhaps at all?). I had it open for a while, and it wouldn’t upload to DropBox until unmounted. Perhaps Apple holds the working data for such a disk in a temporary location instead of updating the source image file as it is edited.

With sparse bundles, that is different. Each band file gets updated synchronously with user changes, which registers as a file level change with DropBox. So the bundle file gets continuously updated as you work in it.

Another reason to use EagleFiler’s end-to-end file verification. :slight_smile:

Ha. Without doubt!

Yes, except that the bundle bit is stored in the HFS+ metadata, not in the resource fork.

Too bad—it should be possible to do something more efficient, like rsync.

For any type of image, you need to unmount to be sure that the data are committed to disk completely and in a consistent state.

They claim to do this, actually. From their help page (which may only be visible with an account):

Does Dropbox always upload/download the entire file any time a change is made?

Dropbox tries to be as smart as possible about uploading for the best possible performance. Before transferring a file, we compare the new file to the previous version and only send the piece of the file that changed. This is called a “binary diff” and works on any file type. Dropbox compresses files before transferring them as well. You also never have to worry about Dropbox reuploading a file - it’s smart about this, too.
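For a sense of what "only send the piece of the file that changed" could look like, here is a deliberately simple fixed-size block comparison. This is not Dropbox's algorithm (which isn't published) and it is not rsync's rolling checksum either; the 4096-byte block size and the function names are assumptions made for the sketch.

```python
import hashlib


def block_hashes(data, block_size=4096):
    """Hash each fixed-size block of a byte string."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(old, new, block_size=4096):
    """Indexes of blocks in `new` that differ from, or do not exist in, `old`."""
    old_hashes = block_hashes(old, block_size)
    new_hashes = block_hashes(new, block_size)
    return [
        i for i, h in enumerate(new_hashes)
        if i >= len(old_hashes) or old_hashes[i] != h
    ]
```

With this scheme a one-byte change marks a single block for upload. Inserting bytes near the start of a file shifts every later block, though, which is the case that rsync-style approaches handle better.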

I’ve just started using a bunch of EagleFiler libraries on sparsebundles (which I chose over sparseimages to ease some of the initial pain of uploading), but I am very nervous about someday winding up with a conflicted copy of one of the bands.

For libraries containing things that I can be confident do not themselves contain xattrs I want to preserve, I’m planning to put EagleFiler libraries straight onto Dropbox. Just to clarify, though: Apart from the bundle setting on the .eflibrary (which I can repair if needed at the target using SetFile), is there anything else that EagleFiler itself stores in extended attributes or resource forks?

Incidentally, I just tested this, and the .eflibrary synced over as a bundle, no need to adjust it with SetFile. Perhaps they’ve already added support for EagleFiler.

Yes, that’s what I was referring to. I guess people will have to see for themselves how well it works.

If Dropbox really does do the above, then there shouldn’t be much difference in uploading speed, right? To avoid conflicts, make sure that you close the library and eject the image before opening it on another Mac.

  1. The .webloc format uses resource forks, so if you ask EagleFiler to create a bookmark it will create a file with a resource fork. EagleFiler redundantly stores the data in the data fork, as well, but some applications only support resource-based .webloc files.

  2. EagleFiler labels (which are stored as Finder labels) are not synced by Dropbox as far as I know.

  3. If you ask EagleFiler to copy its tags to the Spotlight comments, the Finder may store the comments in the extended attributes.

  4. If you use Skim, it will save notes in the extended attributes unless you use the .pdfd format.

There are several different levels of support. Unlike .sparsebundle, .eflibrary will show up as a bundle even without the bundle bit. So on the Mac side, Dropbox doesn’t need to support it. However, in order to show up as a single file in the Web interface, they would need to add support for EagleFiler.

I have had that happen, actually. In my experience it doesn’t impact the integrity of the image whatsoever. I loaded a test copy up with conflicted bands (duplicate copies of the band with the customary long DropBox disclaimer filename), and it mounted perfectly and validated out in EF. I closed it, unmounted, deleted the conflicted bands, and then tried it again with identical positive results.

I’m guessing one of the top-level description files identifies bands and their loading order, so you could technically place a bunch of non-band files in the band directory with no ill effect since the loading parameters only state the band files to load. Now, changing the name or deleting a band could cause issues! But unless you do a partial DB synch that shouldn’t be an issue.

Absolutely agree. This is true of any disk file system, loopback device or no. However it is very clear to me that Apple is being much more aggressive in keeping the disk copy of the image up to date with the loopback device on a constant basis. The moment I save anything or even open EF off of a sparse bundle image, writes get committed to the disk. This results in a near constant level of activity within DropBox while working with an open sparse bundle image.

So yes, nothing on a disk should be considered complete until it is unmounted, but I think from a backup and sync stance, sparse bundles are much safer than disk images. An image left open for days will have, as far as I can tell, zero writes saved to it in the case of a catastrophic shutdown (full power out or something). A sparse bundle file might have gaps, but I’ve already demonstrated that it is resistant to band damage. Localised damage will not destroy the integrity of the rest of the image.

It is no substitute for regular unmounting—but as a working disk image for how we are using it (something the original disk image format was probably never designed for), it is dramatically safer the longer it is left open.

Just a quick note and apologies for a slight tangent.

I had several issues with Dropbox corrupting my (test) EagleFiler libraries when they were either sparsebundles or sparseimages. I started using the free Unix/Mac/Windows tool called Unison (no, not the usenet reader) to synchronize the files on multiple computers. I’ve never had a corruption issue and have even “done the naughty”–which is to synchronize while the program was open on another computer.

Here is the link and I can post more info in my own thread (sorry again…) if someone is interested: http://www.cis.upenn.edu/~bcpierce/unison/

I now even have a script that keeps me from “doing the naughty.” It searches for open instances of EagleFiler on remote computers and will close them.

EDIT: Oops, never mind. Looks like this has been covered before: http://c-command.com/forums/showthread.php?t=61

My apologies.

Yes, since I posted my previous note, I’ve had band conflicts as well (even when the image was open on only one machine). I suspect that it might have to do with the same band getting changed twice in rapid enough succession that the second change propagates before the first, or before the first is finished downloading.

This wouldn’t be a problem if the conflicted band on the image I’m using were treated as the master and the other computer’s version were named with the “conflicted copy” name – that must have been what happened to you. For me, the band that I was actually using was the one that got renamed, so I had to lickety-split move the out-of-date band out of the image and rename my real one to remove the “conflicted copy” moniker. I think I managed it, but I would have been hosed if something had wanted to write to that portion of the image in the time it was conflicted.

I have set up GeekTool to constantly do a “find ~/Dropbox | grep conflicted” and put the results up on my screen in big red letters so that I know within 20 seconds if I have a conflicted file, and I am pretty sure it saved me on this occasion to be able to react immediately.
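The same check can be written as a small Python function for anyone who wants to call it from a script. This is a sketch, not a drop-in replacement: unlike the `find | grep` pipeline, which matches anywhere in the path, it only looks at file names and ignores case. The name `find_conflicted` is made up.

```python
import os


def find_conflicted(root):
    """Return sorted paths under `root` whose file name contains 'conflicted'."""
    matches = []
    for folder, _subdirs, files in os.walk(root):
        for name in files:
            if "conflicted" in name.lower():
                matches.append(os.path.join(folder, name))
    return sorted(matches)
```

For example, `find_conflicted(os.path.expanduser("~/Dropbox"))` returns an empty list when nothing is conflicted, so a wrapper script can stay silent unless there is something to show.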

Thanks for enumerating these. I could have sworn Skim had the option to just save a .skim file next to the PDF, but I can’t find it, so perhaps I imagined it.

By the way, I hadn’t noticed that this had been posted before just now searching the site, but the script to convert pdf to pdfd looks great. I suspect I’ll start using it a lot.

This sounds like a bug in Dropbox. It should be able to at least sync a consistent version of the folder (package). I’m glad that you’ve found a way to deal with this, but I suggest that people who don’t want to worry about conflicts should use the .sparseimage format (see the EagleFiler Manual: How can I access my library from multiple Macs?).