Tiny Workstation Win

May 23, 2009

One fact I forgot to mention yesterday: my workstation is/was my EEE 901 netbook! Somehow I’ve achieved enough proficiency to write some code on it in Ubuntu! My pride went pretty high when I actually checked in my first version of s3arch.php.  It doesn’t yet set the public-read ACL, do any error checking, or calculate MD5s, but it is functional beyond that.  Here are some tech notes about the experience:

  • The position of the arrow keys and right-shift on the EEE layout is ridiculous and still caused me some mishaps.  I’m considering Vim even more, given that (I think) it can move the cursor without the arrow keys when you go to edit something.
  • Copying and pasting from Firefox 3.0.1.0 into Gedit 2.26.1 did something to the PHP: it introduced some characters which caused my script to fail with weird errors.  I literally spent around half an hour tracking down a problem that didn’t exist (i.e. typing the code out myself worked).  So if you see “Unexpected ‘{’” after a cut & paste, consider deleting all that and typing it yourself.  I should have suspected that when the resulting paste removed all line breaks.
  • PHP CLI arguments are not automatically parsed into an associative array like $_GET or $_POST; instead you get the boring ol’ C-style $argv and $argc.  Also, if any script-specific argument starts with a dash, you must put a double-dash on the command line before the script arguments; otherwise PHP tries to interpret them itself.
  • The S3 class I’m using uses cURL, which does not come with PHP CLI by default.  I had actually gotten PHP CLI by installing PEAR, so I didn’t know where or what I had (thankfully I knew enough to run “$ sudo updatedb” and “$ locate php”).  Then I couldn’t figure out how to get php-curl on Linux and got confused by the pecl command, which didn’t work for me.  I tried installing the entire PHP CLI through Synaptic Package Manager but it didn’t get me anywhere new.  Finally, some forum I found indicated I just needed to use a standard package install: “$ sudo apt-get install php-curl”.  Duh.
  • Gedit is okay, but the word jumping is really different from Windows (notepad2 / textpad / visual studio) and that throws me off.  I never realized how much of a ctrl+left/right junkie I am!
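
To make the $argv point concrete, here is a minimal sketch of mapping positional CLI arguments onto names and checking for the cURL extension up front. The helper names parse_cli_args() and require_curl() and the argument names are placeholders invented for illustration, not anything taken from s3arch.php.

```php
<?php
// Hypothetical helper: map positional CLI arguments onto names, giving
// something closer to the $_GET-style array PHP does not build for us.
function parse_cli_args(array $argv, array $names)
{
    // $argv[0] is the script name, so drop it before matching.
    $values = array_slice($argv, 1);
    if (count($values) < count($names)) {
        throw new InvalidArgumentException(
            'Usage: php -f ' . $argv[0] . ' -- ' . implode(' ', $names)
        );
    }
    // Extra trailing arguments are ignored.
    return array_combine($names, array_slice($values, 0, count($names)));
}

// Hypothetical helper: fail early with a readable message if cURL is missing.
function require_curl()
{
    if (!extension_loaded('curl')) {
        throw new RuntimeException('The PHP cURL extension is not loaded.');
    }
}
```

A script could call parse_cli_args($argv, array('key', 'secret', 'bucket', 'path')) at the top and then read $args['bucket'] instead of counting positions.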

So today …

  • Install S3 Fox to remove the garbage I uploaded and check on my results.
  • Calculate the file MD5 and set the public-read ACL.
  • Print out timestamp of existing object if skipping.
  • Add error checking … start by having any error cause the whole thing to stop.
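
A rough sketch of two of these to-do items: hashing a file and making any PHP error stop the whole run. md5_file() and set_error_handler() are standard PHP; the helper name content_md5() is made up here. The binary flag is used because a Content-MD5 header carries the base64 of the raw digest rather than the hex string.

```php
<?php
// Turn warnings and notices into exceptions so the first problem halts the script.
set_error_handler(function ($severity, $message, $file, $line) {
    if (!(error_reporting() & $severity)) {
        return false; // respect the current error_reporting level
    }
    throw new ErrorException($message, 0, $severity, $file, $line);
});

// Hypothetical helper: base64 of the raw MD5 digest of a file.
function content_md5($path)
{
    $raw = md5_file($path, true); // true = raw 16-byte digest instead of hex
    if ($raw === false) {
        throw new RuntimeException("Could not hash $path");
    }
    return base64_encode($raw);
}
```

With the handler installed, an unreadable file surfaces as an exception at the md5_file() call instead of a warning that scrolls past.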

Later …

  • Back port source to ObremSDK where it belongs.
  • Add phpDoc comments to source, that way the license shows up properly in Google Code.
  • What style of comments does Google parse for JavaScript licenses, same as phpDoc?

Alright, let’s get to work!


S3Arch now in PHP!

May 22, 2009

Pain and weird feelings of an RSI type in my arms prevented my last attempt at writing S3Arch, but that excuse no longer holds, and now I’m going to write it in PHP.  Right now.  This post will serve as a light spec for it as well as another shameful reminder if I don’t do it.  The gist …

  • Launched from command-line, e.g. “php -f s3arch.php”.
  • CLI args: key, secret key, bucket, destination path.
  • public-read ACL set for anonymous HTTP GET access.
  • MD5 calculated for added defense against corruption.
  • Objects not created if they exist on the server.
  • Only files, not directories, are PUT as objects.
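
The “only files, not directories” rule can be sketched with PHP’s SPL iterators. This is an illustration under stated assumptions: list_upload_keys() is a hypothetical name, and the object key is assumed to be the path relative to the starting directory, written with forward slashes.

```php
<?php
// Hypothetical helper: map object keys to local paths for every regular file under $root.
function list_upload_keys($root)
{
    $root = rtrim($root, '/\\');
    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS)
    );
    $keys = array();
    foreach ($iterator as $file) {
        if (!$file->isFile()) {
            continue; // directories are never PUT as objects
        }
        // Strip the root prefix and normalise separators to forward slashes.
        $relative = substr($file->getPathname(), strlen($root) + 1);
        $keys[str_replace(DIRECTORY_SEPARATOR, '/', $relative)] = $file->getPathname();
    }
    ksort($keys);
    return $keys;
}
```

Calling list_upload_keys('.') matches the spec’s choice of the current directory as the source path.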

The source path is based on the current directory; I might change that to a parameter later but it works well for now.  Some things to figure out:

  • Reading arguments in a PHP script; probably a super global, I’m guessing $ARGS
  • Creating MD5 for a file; does the S3 class do this for me?
  • Use old S3 class for doing the job
  • Determining if an object already exists: a PUT to an existing key overwrites it rather than failing, so prior to that I need to do a list on everything in the bucket to reference.
  • XML parsing, based on above factor.  Should I just use strpos() for the time being?
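
For the existence check and the XML question, here is a sketch that uses SimpleXML rather than strpos(). It assumes the usual ListBucketResult shape (Contents elements with Key and LastModified children). The sample XML and the function name existing_keys() are made up for illustration, and a real listing can be truncated and need paging, which this ignores.

```php
<?php
// Hypothetical helper: map each object key in a bucket listing to its LastModified value.
function existing_keys($listingXml)
{
    $xml = simplexml_load_string($listingXml);
    if ($xml === false) {
        throw new RuntimeException('Bucket listing is not well-formed XML');
    }
    $keys = array();
    foreach ($xml->Contents as $object) {
        $keys[(string) $object->Key] = (string) $object->LastModified;
    }
    return $keys;
}

// Made-up sample listing, for illustration only.
$sample = '<ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">'
    . '<Name>example-bucket</Name>'
    . '<Contents><Key>arch/a.txt</Key>'
    . '<LastModified>2009-05-22T10:00:00.000Z</LastModified></Contents>'
    . '</ListBucketResult>';

$known = existing_keys($sample);
echo isset($known['arch/a.txt'])
    ? "skip arch/a.txt (last modified {$known['arch/a.txt']})\n"
    : "upload arch/a.txt\n";
```

Keeping the timestamp alongside the key also covers printing when an existing object was last modified before skipping it.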

Alright, let’s get started!  One last note: I’m going to forgo any kind of consistent commenting so I can just get through this!


Weekend Project: S3Arch

March 13, 2009

My tender typing appendages will brook no rest this weekend; I’d like to write and release something.  Here I will write a plan in order to accomplish the effect completely without wallowing in unremarkable dank corners of coding.  S3Arch initially began as a replacement for JungleDisk and thus included more functionality than my daily needs require.  Hardly anything of use came of said desire, not an unusual end for most of my hobbyist technological pursuits.

First, the three major points it must deliver:

  1. Single script / command-line name.  I drop to a prompt or just execute “s3arch” (no extension, no parameters) and it uploads new files from my local arch to S3.
  2. Configuration run once to specify S3 id/key, local path, and remote path (including bucket). Data is written to a “global” file accessible only to the current user, but not encrypted (I’ll get to that later).
  3. Traversal of local folders outputs their names and uploads any files not already on S3.  Existing files are left alone and not shown in the console.  At the end, like robocopy, it will print out how many files were uploaded, how many were skipped, and approximately how many bytes were transferred.
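
Point 2 could look roughly like the sketch below. It is written in PHP to match the script that eventually got written, not the .NET planned in this post. The JSON format, the helper names save_config() and load_config(), and the field names are all assumptions, and the permission bits only mean something on Unix-like systems.

```php
<?php
// Hypothetical helper: save settings to a file only the current user can read or write.
function save_config($path, array $settings)
{
    // Create the file and tighten permissions before any secret is written to it.
    if (!touch($path) || !chmod($path, 0600)) {
        throw new RuntimeException("Cannot prepare $path");
    }
    if (file_put_contents($path, json_encode($settings)) === false) {
        throw new RuntimeException("Cannot write $path");
    }
}

// Hypothetical helper: read the settings back as an associative array.
function load_config($path)
{
    return json_decode(file_get_contents($path), true);
}
```

Nothing here is encrypted, in line with the plan to leave encryption for later.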

Development Points …

  • Written in .NET on Windows but runnable on Ubuntu using Mono.  This shouldn’t be terribly difficult; I achieved it with my date-setting utility.
  • Using Visual Studio because I’m already familiar with its write/debug/test/deploy functionality.
  • Calculate MD5 hash of uploaded files.
  • Use HttpWebRequest directly rather than a library (output must be a single executable!).  I know enough about S3 and .NET to make this happen.
  • Give files public-read ACL.
  • File name delimiter is forward-slash.
  • Content-type based on extension and only known ones allowed (otherwise file is skipped).
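
The content-type rule in the last bullet can be sketched as a small whitelist lookup, shown in PHP for consistency with the later script. The helper name content_type_for() and the particular extensions listed are examples only; the post does not say which types are allowed.

```php
<?php
// Hypothetical helper: return the Content-Type for a known extension, or null to signal "skip".
function content_type_for($filename)
{
    // Example whitelist only; the real list of allowed types is not given.
    $known = array(
        'html' => 'text/html',
        'xml'  => 'text/xml',
        'txt'  => 'text/plain',
        'jpg'  => 'image/jpeg',
        'png'  => 'image/png',
    );
    $ext = strtolower(pathinfo($filename, PATHINFO_EXTENSION));
    return isset($known[$ext]) ? $known[$ext] : null;
}
```

Returning null for anything unknown lets the caller skip the file instead of uploading it with a guessed type.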

I hope that’s enough detail to help me avoid problems.  Immediate features after this will be configuration encryption, checking for text errors, and mirroring to FTP since I need the files on my web host.

Let me talk about text errors for a moment.  I currently have my arch folder persisted on my ReadyNAS at home as well as mirrored to neilstuff.com/arch/.  Recently I found out that a demo I had made was not working and further investigation revealed that the text files had all been corrupted.  However many line breaks there were, that many characters from the end of the file were repeated again.  Thus all XML/HTML was completely fubar.

My guess is that somewhere along the FTP path, I had file transfers set to AUTO, and it totally screwed up the line breaks since I switch between Linux and Windows (especially for hosting) all the bloody time. Thus I’d like a utility as part of S3Arch to validate XML files, look for the final HTML tag, and possibly do some other sugar.  It would merely warn me and refuse to upload the files rather than uploading broken ones.
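
A minimal sketch of that kind of check, again in PHP and under assumptions: text_warning() is a hypothetical name, XML is only tested for well-formedness, and HTML is only checked for ending on its closing tag. Trailing garbage like the repeated characters described above should trip both checks.

```php
<?php
// Hypothetical check: return a warning string if the content looks corrupted, or null if it looks fine.
function text_warning($filename, $content)
{
    $ext = strtolower(pathinfo($filename, PATHINFO_EXTENSION));
    if ($ext === 'xml') {
        $previous = libxml_use_internal_errors(true); // collect parse errors quietly
        $ok = simplexml_load_string($content) !== false;
        libxml_clear_errors();
        libxml_use_internal_errors($previous);
        return $ok ? null : "$filename is not well-formed XML";
    }
    if ($ext === 'html' || $ext === 'htm') {
        // Only whitespace may follow the final closing tag.
        return preg_match('~</html>\s*$~i', $content)
            ? null
            : "$filename does not end with </html>";
    }
    return null; // other file types are not checked
}
```

An uploader could print the warning and skip the file whenever the result is not null.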