You might also want to consider using the mainline kernel instead of the older sunxi 3.4 kernel. It seems to improve performance noticeably even if you don't tweak a lot.
I'm running kernel 4.0.0-rc7 at the moment. I haven't done extensive performance comparisons yet (and probably won't have time to do so in the next few days either), but some quick tests showed that even without the tweaks applied to /etc/rc.local as described in this thread, the mainline kernel almost maxes out the performance of my USB hard drive (27-29 MB/s reading and 30-31 MB/s writing).
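For anyone who wants to run a similar quick check, here is a minimal sketch of a sequential write/read test in Python. It is not the method used for the numbers above; the function name `measure_throughput`, the file size and the block size are all placeholders I made up for illustration. Keep in mind that the read pass will mostly be served from the page cache unless the file is larger than RAM or the caches are dropped first, so treat the read figure with caution.

```python
import os
import time


def measure_throughput(path, size_mb=64, block_kb=1024):
    """Write size_mb MiB to path, read it back, return (write_MBps, read_MBps).

    Hypothetical helper for illustration; the sizes are arbitrary defaults.
    """
    block_bytes = block_kb * 1024
    block = os.urandom(block_bytes)
    n_blocks = (size_mb * 1024) // block_kb

    # Write pass: fsync so the time includes flushing data to the drive.
    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(n_blocks):
            f.write(block)
        f.flush()
        os.fsync(f.fileno())
    write_s = time.perf_counter() - start

    # Read pass: likely cached unless the file exceeds RAM or caches were dropped.
    start = time.perf_counter()
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(block_bytes)
            if not chunk:
                break
            total += len(chunk)
    read_s = time.perf_counter() - start

    os.remove(path)
    mib = total / (1024 * 1024)
    return mib / write_s, mib / read_s
```

Called with a path on the mounted USB drive, it returns the two rates in MiB/s. A test over Samba from a client machine will of course give different numbers than a local run like this.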
This is interesting, especially when you consider that the maximum CPU frequency with the latest mainline kernel is limited to 960 MHz at the moment (this is true for version 4.0-rc6 and higher, unless you change the limits in the device tree files yourself), whereas the older tests on a 3.4 kernel were done at 1008 MHz. It's planned to reintroduce higher frequencies again later, so there is still headroom for slight improvement in this respect - and of course you can add the tweaks suggested by tkaiser on top of that.
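If you want to check which frequency limit your own kernel applies, the standard cpufreq sysfs files report it in kHz. Below is a small sketch in Python that reads them. It assumes the cpufreq driver is loaded and that cpu0 is the core you care about; the helper names `khz_to_mhz` and `read_mhz` are made up for this example.

```python
from pathlib import Path

# Standard cpufreq sysfs location; assumes cpu0 and a loaded cpufreq driver.
CPUFREQ_DIR = Path("/sys/devices/system/cpu/cpu0/cpufreq")


def khz_to_mhz(text):
    """Convert a sysfs frequency string (in kHz) to MHz."""
    return int(text.strip()) / 1000


def read_mhz(name, base=CPUFREQ_DIR):
    """Read one cpufreq file (e.g. scaling_max_freq) and return MHz."""
    return khz_to_mhz((base / name).read_text())


if __name__ == "__main__":
    for name in ("cpuinfo_max_freq", "scaling_max_freq", "scaling_cur_freq"):
        try:
            print(f"{name}: {read_mhz(name):.0f} MHz")
        except (OSError, ValueError) as exc:
            # File missing or unreadable, e.g. no cpufreq support in this kernel.
            print(f"{name}: unavailable ({exc})")
```

With the limit described above, I would expect `scaling_max_freq` to contain 960000, but I have not verified the output on every kernel version mentioned here.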
As I said, I haven't done extensive testing. But the tests were done using the same rootfs. The only things that differed were the kernel version (obviously), the U-Boot version (mainline U-Boot 2015.04-rc3), and that the older tweaks in /etc/rc.local were commented out (since I wasn't sure whether the device nodes/naming would be the same, and I also wanted to see how the mainline kernel behaves in general with regard to multi-threading and IRQ handling). The Samba version and configuration were identical.
My main reason for switching to the mainline kernel is to have an easier way of updating the kernel, without manually updating the 3.4 kernel sources and dealing with merge conflicts, etc. So it's a nice side effect to see that there are also performance improvements.