Tuesday, 29 July 2008

libgpod callout improvements

The other day, I mentioned that the libgpod hal callout was setting volume.label to the iPod name to get a nicer name displayed for iPods in Nautilus (among other apps). What I didn't say is that I wasn't really sure that overriding that property with something that has nothing to do with the actual filesystem label was such a good idea.

And it wasn't. After I asked David Zeuthen about that on IRC, he kindly told me about info.desktop.name, which was added explicitly for that purpose. I also learnt about info.desktop.icon, which is to be preferred to info.icon_name. So I made these two changes and pushed them to the podsleuth branch of my git repository.

While I was at it, I worked on the few things that are still in the way of a libgpod 0.7 release: I cleaned up the exported symbols to make sure that what we export makes sense from an API point of view, added some missing API doc, and made a few fixes to the existing doc (some functions had been renamed and the API doc wasn't properly updated).


On an unrelated note, I'm glad to see that some people find this blog worth some comments, thanks ;)

Monday, 28 July 2008

And my other project is...

After describing my latest work on Rhythmbox yesterday, here's what I did on libgpod in the last month. libgpod is a cross-platform library used by many different projects (amarok, gtkpod, rhythmbox, songbird to name a few) to access and modify your iPod content.

Extensive SysInfoExtended parsing

As has been known for a while now, the iPod can be queried about its capabilities using SCSI commands; it returns XML data describing the iPod (serial number, firmware version, ...) and what it can do (podcast support, video formats supported, image formats that it knows how to display, ...). When we released libgpod 0.6, we introduced a hal callout to send the appropriate SCSI query to the iPod and to dump the returned XML data to a file that we named SysInfoExtended. Normal users aren't guaranteed to be able to send raw SCSI commands to a device, hence the use of a hal callout and the dumping of the information to a regular file. However, in libgpod 0.6, we only had a very basic parser for that file, which knew how to read just the one SysInfoExtended field we needed. Most of the information about the iPod capabilities was hard-coded into per-iPod-model tables, and libgpod had to be told the iPod model before being able to (for example) write artwork to an iPod.

For the next release of libgpod, I decided that we had to be able to use the information from SysInfoExtended to its fullest. I started by writing a generic plist (which is the XML subset SysInfoExtended is in) to GValue parser using libxml instead of GMarkup. Then, I extracted the data I was interested in from the GValue collection the parser gave me into a nice C struct. To make the addition of new fields easy, most of the work is driven from a table indicating the field name in the plist file, the type we want to assign to that field's data, and the offset at which we want to put that data in the resulting struct. Modifying that table and the struct definition are the only things that need to be done if we want to read additional fields from SysInfoExtended.
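To illustrate the table-driven idea, here is a minimal sketch in Python rather than libgpod's actual C code. It uses the standard plistlib module in place of the libxml/GValue parser, and an attribute name in place of a struct offset. The SysInfo class, the FIELDS table, the key names and the sample data are all illustrative assumptions, not libgpod's real definitions.

```python
import plistlib
from dataclasses import dataclass
from typing import Optional


@dataclass
class SysInfo:
    """Illustrative stand-in for the C struct (not libgpod's real one)."""
    serial_number: Optional[str] = None
    firewire_guid: Optional[str] = None
    podcasts_supported: bool = False


# One row per field: (key in the plist, expected type, attribute to fill).
# The key names are assumptions chosen for the example.
FIELDS = [
    ("SerialNumber", str, "serial_number"),
    ("FireWireGUID", str, "firewire_guid"),
    ("PodcastsSupported", bool, "podcasts_supported"),
]


def parse_sysinfo(xml_bytes: bytes) -> SysInfo:
    """Parse an XML plist and copy the fields listed in FIELDS."""
    plist = plistlib.loads(xml_bytes)
    info = SysInfo()
    for key, expected_type, attr in FIELDS:
        value = plist.get(key)
        # Silently skip missing keys and values of an unexpected type.
        if isinstance(value, expected_type):
            setattr(info, attr, value)
    return info


# Made-up sample data, only here to exercise the parser.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<plist version="1.0">
<dict>
  <key>SerialNumber</key><string>XX0000000XX</string>
  <key>PodcastsSupported</key><true/>
  <key>UnknownField</key><integer>3</integer>
</dict>
</plist>
"""

info = parse_sysinfo(SAMPLE)
```

With this layout, reading one more field means adding one row to FIELDS and one attribute to SysInfo; the parsing loop itself does not change.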

With that done, I had everything I needed to have libgpod use the information provided by the device to write artwork to the iPod instead of relying on hard-coded tables. Some refactoring was needed to make it possible to use the artwork data from the iPod (there were some assumptions here and there that the formats supported by the iPod were known at compile time), but now that it's done, the code feels much more natural and maintainable than before.

Getting the iPod model from SysInfoExtended

With the aforementioned work, writing artwork to the iPod has been made much more flexible, but libgpod was still unable to automatically guess the iPod model/color/... from the device without asking the user. This is because libgpod relied on the iPod "ModelNum" to do that. This number used to be present in a file on the iPod filesystem, but for quite some time now, the only way to get it has been to read it on the iPod box, which is not really easy to do from software :)

But for all recent iPod models, there's another way to guess the iPod model: parsing the iPod serial number. And this serial number is precisely one of the things that we can read from SysInfoExtended! So all we had to do to be able to automatically detect the model/color/... of a plugged-in iPod was to properly parse the iPod serial number and to infer the iPod's physical features from it, just as podsleuth does with that table.
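The lookup can be sketched as follows. This is only an illustration of the idea: it assumes the model is identified by a fixed-length suffix of the serial number, and both the suffix length and the table entries below are placeholders, not real iPod data.

```python
from typing import NamedTuple, Optional


class Model(NamedTuple):
    generation: str
    color: str


# Assumption for this sketch: the last SUFFIX_LENGTH characters of the
# serial number identify the model.
SUFFIX_LENGTH = 3

# Placeholder entries; a real table would be filled from known devices.
SERIAL_SUFFIX_TO_MODEL = {
    "AAA": Model("example-generation-1", "silver"),
    "BBB": Model("example-generation-2", "black"),
}


def model_from_serial(serial: str) -> Optional[Model]:
    """Return the model for a serial number, or None if it is unknown."""
    serial = serial.strip().upper()
    if len(serial) < SUFFIX_LENGTH:
        return None
    return SERIAL_SUFFIX_TO_MODEL.get(serial[-SUFFIX_LENGTH:])
```

Returning None for an unknown suffix leaves the caller free to fall back to asking the user, as was needed before.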

Podsleuth

Given that libgpod had already installed a hal callout, and after all the work done to parse the SysInfoExtended data, I realized that libgpod had gathered all the pieces to build a podsleuth clone, so I decided to try to write one just for fun and to test the new API added to libgpod in real-world situations. This led to the work which can be found in the podsleuth branch of my libgpod git repository.

Even though I haven't tested it against banshee, I compared the properties exported by podsleuth and by this experimental stuff, so this code can probably be used as a drop-in replacement for podsleuth. Writing it also made me realize that podsleuth doesn't export enough information about artwork formats compared to what libgpod needs. I'm also not a big fan of how podsleuth exposes the artwork formats: it parses the XML to get the artwork data, only to immediately serialize it again to a string. It's then up to the app using podsleuth to parse that string (again) to get the artwork formats supported by the iPod.

Anyway, now that I have hacked up this nice tool, it's up to me to experiment a bit with all of that and to make suggestions as to how things could be improved :) By the way, I already used that code to see how iPod integration with the desktop could be improved: it sets the volume.label HAL property to the name of the iPod as extracted from the iPod database, which results in a nicer name for the iPod on your Nautilus desktop.

Sunday, 27 July 2008

blogo ergo sum

After being kicked again and again (which hurts, the guy is a black belt in karate) by Dodji who wanted me to blog, here is a first post.

Album Artist support

These last few days, I've been trying to get back to Rhythmbox development to scratch a few itches of my own.
First, I've looked at how Rhythmbox could handle compilations, i.e. albums containing tracks by different artists. Currently, if the album has 12 different artists, these 12 artists will appear separately in the artist list, which can quickly create a big mess. I wrote a basic patch to make it possible to set an "album artist" for such albums and to use that instead of the multiple different artists in the artist list. I had to experiment a bit with various approaches, but in the end, the patch is surprisingly small.


Song UIDs

Then, I wanted Rhythmbox to be able to provide UIDs for the songs in its database. What I call a UID is some kind of identifier that is unique to a song and that can be generated by only looking at the song data. This can be useful for various things: iPod (or whatever your portable media player of choice is) synchronization, associating user data (rating, play count, ...) with a song in a way that persists even if the user does a mv of the song from the shell, ... I learnt after doing that work that Charlotte had been looking for such a feature in Rhythmbox for her nice Rhythmbox SOC, which was good news :)

To generate that UID, I chose to hash the song title, artist and album (read from the tags of the song) together with the first 8 kB of data of the file (actually, this hashing scheme was heavily inspired by what Amarok does). I'm not sure yet if this is the best way to uniquely identify a song, but we'll only know after people try to use it. Before you ask, I thought about using musicbrainz/musicDNS acoustic fingerprints, but as far as I know, none of those fingerprints can be generated using free software end to end: there's always some closed source webservice that must be queried to get a fingerprint from a few parameters that were generated by analyzing the song audio data.
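As a rough sketch of that scheme (not the actual Rhythmbox patch), the UID could be computed like this in Python. The choice of SHA-1, the order of the tags and the NUL separator between them are assumptions made for the example; the song_uid name is hypothetical.

```python
import hashlib

CHUNK_SIZE = 8 * 1024  # only the first 8 kB of the file are hashed


def song_uid(path: str, title: str, artist: str, album: str) -> str:
    """Hash the three tags plus the first 8 kB of the file contents."""
    digest = hashlib.sha1()  # assumption: any stable hash would do here
    for tag in (title, artist, album):
        digest.update(tag.encode("utf-8"))
        # Separator so that ("ab", "c") and ("a", "bc") hash differently.
        digest.update(b"\0")
    with open(path, "rb") as song_file:
        digest.update(song_file.read(CHUNK_SIZE))
    return digest.hexdigest()
```

Since the file name is not part of the hash, the UID survives a rename or a move of the file. Editing the title, artist or album tag does change it, though.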

Once again, this feature was straightforward to implement.

The main issue I had was to debug the UID generation. Indeed, metadata reading (where I chose to add the UID generation) is done by a separate process which communicates with Rhythmbox through dbus. Reading metadata is basically equivalent to feeding random data to the tag reading library, so it's really hard to guarantee the library won't crash or hang in some corner cases. Using that external process allows Rhythmbox not to crash or hang if such an event should occur during metadata reading.

But this external process also makes debugging harder: it's short-lived, spawned on demand and run in the background (i.e. it's not possible to print stuff to stderr or stdout). So moch's help was really welcome, since he explained to me how to run that metadata helper process by hand and how to tell rhythmbox to use it. It's really simple: all you have to do is (optionally) increase ATTENTION_SPAN in metadata/rb-metadata-dbus-service.c so that the helper stays alive longer (by default, it dies after 30 seconds of inactivity).
Then, you can run rhythmbox-metadata in nemiver (or in your favourite debugger), which will output a line like:

unix:abstract=/tmp/dbus-vXSVpHsnpL,guid=ba4e19b37904dba3bb1fc2214889d478

If you now set the RB_DBUS_METADATA_ADDRESS environment variable to that value before running Rhythmbox, then Rhythmbox will use the metadata helper you just launched in your debugger. Now all that is left to do is debugging!
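If you prefer to script that last step, here is a small Python sketch. The env_with_helper and launch_rhythmbox helpers are hypothetical names, and the sketch assumes that rhythmbox is in your PATH; only the RB_DBUS_METADATA_ADDRESS variable comes from Rhythmbox itself.

```python
import os
import subprocess
from typing import Dict, Optional

ENV_VAR = "RB_DBUS_METADATA_ADDRESS"


def env_with_helper(address: str,
                    base_env: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment pointing at the given helper."""
    env = dict(os.environ if base_env is None else base_env)
    env[ENV_VAR] = address
    return env


def launch_rhythmbox(address: str) -> subprocess.Popen:
    """Start Rhythmbox so that it uses the helper running in the debugger."""
    # Assumes the "rhythmbox" binary can be found in PATH.
    return subprocess.Popen(["rhythmbox"], env=env_with_helper(address))
```

You would call launch_rhythmbox() with the address line printed by the helper once it is running in the debugger.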

The result of this work can be found in the uid branch of my Rhythmbox git tree. It still needs some polishing, but the basics should already be working (including automatically updating your database to add UIDs when you first run it).