Softcore software development
It's all about the cycles

Archive for May, 2008

The many ways around a problem

programming 1 Comment »

I came across a bug in the zipfile python module yesterday that I had to fix today. The problem occurs when you create a ZipFile object and pass it a corrupt zip file. It doesn’t handle this gracefully by returning None or throwing an exception. Instead, it heads into an infinite loop.

This is rather unfortunate for me. How would I get around this problem? The first thing I did was check for an updated python, and there was a minor version upgrade available. I found the changelog (why do they hide these things?) and noticed a few bugs resolved in the zipfile module, so I installed it. Unfortunately, this didn’t solve my problem.

I managed to find a bug number in the python bug tracker where people were having similar problems. There was a patch, but it hasn’t landed. I downloaded the latest stable version, but the patch wouldn’t apply. So I had to cvs checkout trunk and apply it there. Once installed, I tried it and it worked! Success.

However, it broke another library I was using (PyXML). Unfortunately for me, the recent trunk build didn’t seem to fare any better.

At this point, I wasn’t in the mood for debugging. I had a few options at my disposal :

  1. Ignore this particular file
  2. Suck it up and debug it.
  3. Find a whacky work-around

Option 1 isn’t an option. Option 2 I tried for a fair while, but nothing worked. So Option 3 was my only option!

I tried using a lower level library (zlib) to see if I could work around the problem, but that didn’t work well at all.

I finally thought I had no choice but to start a thread to unzip the xpi and, if it took longer than 10 seconds, kill the thread somehow. While seriously looking into this, and fighting the temptation to take tequila shots at work, I came across signals (which I thought I could send to the thread. I’m so naive). It turns out you can ask for a signal to be delivered after a specific number of seconds, and what you get is SIGALRM. This was exactly what I needed, without the extra complexity. The example provided was almost exactly what I did too! Here is my solution to the problem :

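		# Runs inside the loop over files; io is the file object for the current xpi.
		# signal_handler is assumed to raise an exception when SIGALRM arrives.
		signal.signal(signal.SIGALRM, signal_handler)
		signal.alarm(10)  # ask for SIGALRM in 10 seconds
		try:
			zippy = zipfile.ZipFile(io, 'r')
		except Exception:
			# Either the alarm fired or the archive could not be read
			print("\tZipFile Timeout")
			continue
		finally:
			signal.alarm(0)  # cancel the pending alarm on every path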
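For anyone who wants to try the trick outside of my script, here is a minimal self-contained sketch in current Python. The names `Timeout` and `call_with_timeout` are made up for this example, and it assumes a Unix system and the main thread, since that is where `signal.alarm` works. It is a sketch of the idea, not the exact code I ran.

```python
import signal
import time


class Timeout(Exception):
    """Raised by the SIGALRM handler when the time limit expires."""


def _raise_timeout(signum, frame):
    # Runs in the main thread, interrupting whatever code is executing
    raise Timeout()


def call_with_timeout(func, seconds, *args):
    """Run func(*args); return None if it takes longer than `seconds`.

    `seconds` must be a whole number because signal.alarm takes an integer.
    """
    previous = signal.signal(signal.SIGALRM, _raise_timeout)
    signal.alarm(seconds)
    try:
        return func(*args)
    except Timeout:
        return None
    finally:
        signal.alarm(0)                          # cancel any pending alarm
        signal.signal(signal.SIGALRM, previous)  # restore the old handler


if __name__ == "__main__":
    # time.sleep(3) stands in for a call that hangs
    print(call_with_timeout(time.sleep, 1, 3))
```

With zipfile, the call would look something like `call_with_timeout(zipfile.ZipFile, 10, io, 'r')`, treating a `None` result as "skip this file".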

Maybe python isn’t just for programming sissies after all.


May 28th, 2008 |

Tags: intern, python, seneca




It’s hot down here

addons 2 Comments »

So I have been spending a few hours here and there since starting my internship on this side project. It’s an extension that watches the tinderbox tree and reports back what is burning, along with the status of the tinderbox’n that you’re interested in.

There were a few goals I had in this release. The main objective, however, is to help avoid making trips to the tinderbox page (because it’s large, and slow). For me at least, I am only concerned about Linux tinderboxes being red so I can checkout :) . But others might have different needs, so I generally tried to include everything I could. But I could have made a mess of things.

I should mention that you should have a reasonably fast connection (ie. not 56K modem). Even GoogleWiFi was able to reasonably download the json and bonsai xml files that I needed to get things working. Most developers should be fine.

I mainly tried to squeeze as much information as possible into two popup menus, making use of the tooltip to show more information than would otherwise be possible. I also show which menuitems are links by giving them an icon. But it has been a bit overdone.

Anyways, here are some images to show you what you can expect.

When loading, you’ll be amused by the animated png throbber that shows up on the statusbar

Before it can be useful, you have to set it up

The options menu shows you what tinderboxes are available to be watched. For now, you will only see the Firefox tinderbox. This was mostly because I was less interested in the other trees. Timeout refers to how long the extension should wait before updating. You want to keep this value reasonable.

The statusbar icon will show you the worst state of any of your chosen tinderbox trees.
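The "worst state" rule is simple enough to sketch. This is Python rather than the extension's JavaScript, and the state names and their ordering are assumptions made for the example, not the values tinderbox actually reports.

```python
# Assumed severity ordering: higher numbers are worse
SEVERITY = {"green": 0, "orange": 1, "red": 2}


def worst_state(states):
    """Return the most severe state among the watched trees, or None if there are none."""
    return max(states, key=lambda state: SEVERITY[state], default=None)


if __name__ == "__main__":
    print(worst_state(["green", "red", "orange"]))
```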

A left click shows tinderboxes and their status

A right click shows bonsai information. From bottom to top, it shows the most recent check-ins. Hovering over menuitems gives you the time/date of the check-in as well as the check-in message.

Sub menus show a component::file display. Showing the full path took too much room, so I wanted to show what I thought would give you enough information to take an educated guess as to what was being changed. Hovering, of course, shows you the full path and new version.

Bwahaha, the extension lives here in this insecure site until I get it up on AMO. You can also fetch the source from repository.cesaroliveira.net. Any criticisms (hopefully constructive) can be emailed. In the meantime, enjoy this most beta software :)


May 27th, 2008 |

Tags: personal, seneca, tinderbox




Taming the beast from within

addons No Comments »

The next 5 paragraphs are me whining. To get to the real important stuff, start at paragraph 6

So I have been pouring two weeks into WildOn, which is finding out how many addons exist out there in the wild. But before I start unleashing web crawlers on the web causing havoc and chaos, it will be helpful if we could compare what’s out there with what we know. What we know is everything from AMO, so we start there. The point of this extra work is to have some results, so that when we release a web crawler on AMO and tell it to find all the extensions, we’ll have something to compare its results to.

Actually, even this was a bit confusing. AMO provides an API to view its addons (well actually, two versions of the API, with the older being slightly more useful). But that information was eventually scrapped for several reasons. The main one is that there is a lot of information on AMO that isn’t in the extension itself, such as which operating systems are supported and whether the addon is a theme or an extension. While the former has been supported since Firefox 2, I have rarely seen it used; the latter is completely optional. This makes any sort of conclusion inconclusive, because you don’t have enough information.

Then there was the problem of having too much information in the database, to the point where ~4000 addons took up ~1.8 gigs. For an sqlite database, this can get slow. Some queries, such as the number of extensions that support the ‘jp-JP’ locale, get even more intensive, because they build a table comprising tens of thousands of rows (one row for each guid/locale combination). The reason is that older versions were being included in the same table as the newest version of each addon, and some addons had 50+ different versions. The solution seemed to be to move the old versions to a different table. SQL queries seem to go much faster now.
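To make the table split concrete, here is a rough sketch using Python's sqlite3 module. The table names, column names and sample rows are all invented for the example; my real schema has more in it. The idea is only that the locale count runs against a table holding the newest version of each addon.

```python
import sqlite3


def archive_old_versions(conn, latest):
    """Move every row that is not an addon's newest version into the _old table."""
    for guid, version in latest.items():
        conn.execute(
            "INSERT INTO addon_locales_old "
            "SELECT guid, version, locale FROM addon_locales "
            "WHERE guid = ? AND version <> ?", (guid, version))
        conn.execute(
            "DELETE FROM addon_locales WHERE guid = ? AND version <> ?",
            (guid, version))


def count_extensions_for_locale(conn, locale):
    """Count distinct addons whose newest version ships the given locale."""
    row = conn.execute(
        "SELECT COUNT(DISTINCT guid) FROM addon_locales WHERE locale = ?",
        (locale,)).fetchone()
    return row[0]


# Hypothetical schema: one row per guid/version/locale combination
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE addon_locales     (guid TEXT, version TEXT, locale TEXT);
    CREATE TABLE addon_locales_old (guid TEXT, version TEXT, locale TEXT);
    CREATE INDEX idx_addon_locales_locale ON addon_locales (locale);
""")
conn.executemany("INSERT INTO addon_locales VALUES (?, ?, ?)", [
    ("{a}", "1.0", "en-US"), ("{a}", "2.0", "en-US"), ("{a}", "2.0", "ja-JP"),
    ("{b}", "0.9", "ja-JP"), ("{b}", "1.0", "en-US"),
])

# Made-up map of each addon's newest version
archive_old_versions(conn, {"{a}": "2.0", "{b}": "1.0"})
print(count_extensions_for_locale(conn, "ja-JP"))
```

This also shows why some locales can end up with a count of 0: addon {b} only shipped ja-JP in an old version, so it stops counting once that row is archived.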

Another issue that makes me loathe RDF is install.rdf. I strongly disagree with the use of rdf for anything :) It is difficult to parse with a regular xml parser (there are a few python rdf libraries out there, but rdflib, the most promising, seems to like not working and not having good examples. Only sheppy can save them now, but he’s working on mdc). This is especially true of rdf:resource, which I am completely ignoring right now. It also seems that AMO editors like to get creative with install.rdf, which has caused problems for me (eg. I can not rely on targetPlatform. Some extensions actually have their targetPlatform in the Description tag. I know this because one of the extensions had Firefox’s GUID :( ). There are also other quirks, like having the id as an attribute of Description instead of a child tag. These are all things that are probably perfectly valid, but they make my life significantly more difficult.

Yet another problem was that many early extensions did not use chrome.manifest, and some newer ones don’t either. So their locale information lives in either install.rdf or contents.rdf. This makes me (and by extension, kittens and baby Jesus) sad. I don’t have a fix for this yet.
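As an illustration of the id quirk, here is a small sketch with a plain xml parser (ElementTree) that accepts the id either as a child tag or as an attribute of Description. The function name and the sample manifests are made up for this example, and it ignores rdf:resource entirely, just like I do.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EM_NS = "http://www.mozilla.org/2004/em-rdf#"
MANIFEST_ABOUT = "urn:mozilla:install-manifest"


def read_addon_id(install_rdf_text):
    """Return the addon id from install.rdf text, or None if it cannot be found."""
    root = ET.fromstring(install_rdf_text)
    for desc in root.iter("{%s}Description" % RDF_NS):
        # The about attribute shows up both with and without the rdf prefix
        about = desc.get("{%s}about" % RDF_NS) or desc.get("about")
        if about != MANIFEST_ABOUT:
            continue  # skip nested Descriptions such as targetApplication
        child = desc.find("{%s}id" % EM_NS)
        if child is not None and child.text:
            return child.text.strip()          # <em:id>...</em:id> form
        return desc.get("{%s}id" % EM_NS)      # em:id="..." attribute form
    return None


# Two made-up manifests showing both spellings of the same information
ELEMENT_FORM = """<RDF xmlns="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
     xmlns:em="http://www.mozilla.org/2004/em-rdf#">
  <Description about="urn:mozilla:install-manifest">
    <em:id>sample@example.invalid</em:id>
  </Description>
</RDF>"""

ATTRIBUTE_FORM = """<RDF xmlns="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
     xmlns:em="http://www.mozilla.org/2004/em-rdf#">
  <Description about="urn:mozilla:install-manifest"
               em:id="sample@example.invalid"/>
</RDF>"""

if __name__ == "__main__":
    print(read_addon_id(ELEMENT_FORM), read_addon_id(ATTRIBUTE_FORM))
```

Only looking at the Description whose about is the install manifest is what keeps a target application's GUID from being mistaken for the extension's own id.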

But enough about problems, what about SUCCESS!?

Ok. So I managed to get a local copy of every extension that is on AMO. Since parsing, analyzing and writing to persistent storage takes a long time, I decided to save myself some trouble and just do the first 2500 extensions (out of the ~7K folders that I have).

Of the 2500 ‘extension numbers’, 1630 were successfully analyzed. This is mainly because extension numbers don’t increment perfectly (eg. there is no addon #1. The first one starts at #4). Only about 100 addons failed to parse, giving me a success rate of 94%. Some extensions had quirks in them (eg. bad RDF) that were either invalid or that I couldn’t figure out.
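For the record, this is how the 94% falls out of those numbers. The failure count is "about 100", so treat the result as approximate, and the variable names are just for this example.

```python
numbers_tried = 2500   # extension numbers walked
analyzed = 1630        # parsed and stored successfully
failed = 100           # approximate count of addons that failed to parse

# Success rate is measured against addons that actually exist, not numbers tried
success_rate = analyzed / (analyzed + failed)

# Whatever is left over are numbers with no addon behind them
missing_numbers = numbers_tried - analyzed - failed

print(f"{success_rate:.1%} success, {missing_numbers} unused numbers")
```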

Out of the 1630 extensions, this is what xulrunner-like applications they supported :

And here are the approximate numbers :

Name Count
Prism/Webrunner 2
Songbird (old) 2
Instant 1
Midbrowser 3
toolkit (any gecko 1.9 application) 7
eMusic DLM 12
Seamonkey (broken GUID) 2
Nvu 11
Sunbird 16
Thunderbird 256
Songbird 13
Seamonkey 101
Flock 159
Netscape Navigator 68
Mozilla Suite 166
Firefox 1466

This looks ok so far. One expects a few non-Firefox extensions. The Thunderbird numbers seem a little low. Reminder that this is only ~33% of the total addons.

Locales seem to be a bigger mess, as there are many early extensions that don’t use chrome.manifest, so I decided to skip it, but now realize I have to fix it. Out of 1630 addons, only 464 addons had chrome.manifest files that I was able to read. But here is the breakdown anyways :
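Reading locales out of chrome.manifest is the easy case, since each locale is registered on its own line in the form `locale <package> <locale name> <path>`. A minimal sketch, with a made-up function name and a made-up sample manifest:

```python
def locales_from_manifest(manifest_text):
    """Collect the locale names registered in a chrome.manifest file."""
    locales = set()
    for line in manifest_text.splitlines():
        fields = line.split()
        # Expected shape: locale <package> <locale-name> <path> [flags]
        if len(fields) >= 4 and fields[0] == "locale":
            locales.add(fields[2])
    return locales


# Hypothetical manifest for an extension called "sample"
SAMPLE_MANIFEST = """\
content sample jar:chrome/sample.jar!/content/
# comments and blank lines are skipped

locale  sample en-US jar:chrome/sample.jar!/locale/en-US/
locale  sample fr-FR jar:chrome/sample.jar!/locale/fr-FR/
skin    sample classic/1.0 jar:chrome/sample.jar!/skin/
"""

if __name__ == "__main__":
    print(sorted(locales_from_manifest(SAMPLE_MANIFEST)))
```

This does nothing for the extensions that keep their locales in contents.rdf, which is the part I still have to fix.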

Number of locales : 173 (en, en-US, en-GB are all considered different locales). There are some invalid locales. For example, Xultris has an invalid locale called xultrisLocale. This can be fixed with a regular expression, but anyways.
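Something like the following would do for the regular expression. The pattern is my own guess at what a locale name should look like (a 2 or 3 letter language, an optional 2 letter region, an optional variant such as "mac"), not an official definition. It only checks the shape, so a wrong but well-formed code like jp-JP still gets through.

```python
import re

# Assumed shape: language[-REGION][-variant], matched case-insensitively
LOCALE_SHAPE = re.compile(
    r"^([a-z]{2,3})(?:-([a-z]{2}))?(?:-([a-z0-9]+))?$", re.IGNORECASE)


def normalize_locale(name):
    """Return the locale with consistent casing, or None if it is not locale-shaped."""
    match = LOCALE_SHAPE.match(name)
    if match is None:
        return None
    language, region, variant = match.groups()
    parts = [language.lower()]
    if region:
        parts.append(region.upper())
    if variant:
        parts.append(variant.lower())
    return "-".join(parts)


if __name__ == "__main__":
    for name in ("en-US", "sr-Yu", "ja-JP-mac", "xultrisLocale"):
        print(name, "->", normalize_locale(name))
```

Normalizing the case would also merge rows like sr-YU and sr-Yu, or ko-KR and ko-Kr, which are counted separately right now.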

Locale Supported Extensions
en-US 439
sv-SE 57
it-IT 190
de-DE 189
pl-PL 137
es-ES 181
fi-FI 64
ru-RU 129
nl-NL 145
pt-BR 162
fr-FR 204
ja-JP 124
zh-CN 126
zh-TW 114
ko-KR 86
cs-CZ 90
en-GB 29
es-AR 54
mn-MN 4
ro-RO 30
sk-SK 118
ca-AD 56
el-GR 38
pt-PT 49
ar 18
uk-UA 61
sr-YU 12
bg-BG 28
hu-HU 84
hr-HR 64
da-DK 92
nb-NO 32
sl-SI 23
lt-LT 21
tr-TR 72
ar-TN 0
de-AT 10
he-IL 41
el 6
ja-JA 1
mk-MK 10
be-BY 25
sq-AL 8
en 19
de 22
es 7
km-KH 6
th-TH 14
it 13
az-AZ 2
id-ID 8
fy-NL 13
fa-IR 33
af-ZA 8
ar-SA 4
cy-GB 0
gl-ES 11
ms-MY 3
ar-JO 1
es-CH 0
es-CL 6
am-HY 1
hi-IN 5
vi-VN 4
en-AU 5
cz-CZ 1
he 1
fa 1
ur 1
ja 18
fr 23
nl 9
pl 9
ru 14
sk 15
eu-EU 1
de-CH 5
ko 4
hr 1
sr-Yu 3
ga-IE 7
pt-PR 0
tr 3
cs 4
hu 7
en-BZ 3
en-CA 4
en-IE 3
en-JM 3
en-NZ 3
en-PH 3
en-TT 3
en-ZA 3
en-ZW 3
es-BO 1
es-CO 1
es-CR 1
es-DO 1
es-EC 1
es-SV 1
es-GT 1
es-HN 1
es-NI 1
es-PA 1
es-PY 1
es-PE 1
es-PR 1
es-MX 2
es-UY 1
es-VE 1
fr-BE 2
fr-CA 2
fr-CH 2
fr-LU 2
fr-MC 2
eu-ES 3
zw-TH 0
da-DA 1
be 1
eo 1
ca 7
pt 2
ar-DZ 1
jp-JP 0
et-EE 2
nl-BE 1
eu 1
en-EN 0
sr-CS 1
ua-UA 1
no-NO 1
mn-MK 0
sl-SL 2
is 2
nn-NO 1
lv-LV 0
uk-AU 1
ja-JP-mac 2
ml-IN 1
wa-BE 1
is-IS 2
ca-ES 0
sv 1
fr-fR 0
da 7
fi 2
ro 1
ar-LB 0
sr-RS 3
en-UK 2
es-US 1
de-LI 1
de-LU 1
ko-Kr 1
no 1
zh 1
bg 1
tl 1
sr 1
sq 1
sl 2
xultrisLocale 1
ca-CD 1
se-SV 1
mn 0
mk 1
pa-IN 0
ka 1
lt 1
uk 2
ar-AR 1
he-HL 0
convertLocale 1

Some locales will have 0 supported extensions. This is because we are only counting the most up-to-date version of each extension, and not counting previous versions which may have supported that locale. While doing a graph for each locale would be unwise, a much wiser choice would be to break it down by language.
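Breaking it down by language means taking the part of the locale before the dash and counting each extension once per language, no matter how many regional variants it ships. That is why the language totals are smaller than the sum of the matching locale rows. A sketch, with an invented function name and invented sample data:

```python
from collections import defaultdict


def extensions_per_language(locales_by_extension):
    """Map each language to the number of distinct extensions that support it."""
    extensions = defaultdict(set)
    for extension, locales in locales_by_extension.items():
        for locale in locales:
            language = locale.split("-", 1)[0]   # "en-US" -> "en", "en" -> "en"
            extensions[language].add(extension)
    return {language: len(found) for language, found in extensions.items()}


# Made-up input: extension name -> locales found in its chrome.manifest
SAMPLE = {
    "ext-a": {"en-US", "en-GB", "fr-FR"},
    "ext-b": {"en"},
    "ext-c": {"fr"},
}

if __name__ == "__main__":
    print(extensions_per_language(SAMPLE))
```

In the sample, ext-a ships two English locales but still counts as a single English extension.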

So which languages are best supported?

Language Extensions supported
en 462
sv 58
it 202
de 212
pl 145
es 192
fi 66
ru 143
nl 154
pt 165
fr 225
ja 142
zh 148
ko 91
cs 94
mn 4
ro 31
sk 133
ca 64
el 44
ar 21
uk 64
sr 19
bg 29
hu 91
hr 65
da 100
nb 32
sl 27
lt 22
tr 75
he 42
mk 11
be 26
sq 9
km 6
th 14
az 2
id 8
fy 13
fa 34
af 8
cy 0
gl 11
ms 3
am 1
hi 5
vi 4
cz 1
ur 1
eu 5
ga 7
zw 0
eo 1
jp 0
et 2
ua 1
no 2
is 4
nn 1
lv 0
ml 1
wa 1
tl 1
xultrisLocale 1
se 1
pa 0
ka 1
convertLocale 1

And here is the obligatory graph for those numerically challenged by high school mathematics teachers.

top 10 languages for 464 analyzed extensions

So what does this lead to? First I need to fix locales. We need to get the vast majority of them. Next, I want to profile all the extensions and not just the first 2500. And then, I want to start looking at web crawlers and learning how to crawl a simple website before unleashing a monster on AMO.


May 22nd, 2008 |

Tags: intern, seneca, wildon




A bonsai forest fire

personal No Comments »

I’ve spent a few hours trying to figure out how best to incorporate some of the bonsai features into the json output of tinderbox. Bonsai output seemed to be restricted to HTML only, at least initially. Searching devmo proved fruitless, so I asked in #developers, where Mossop mentioned a program he made a while back that parsed the HTML and found what he needed. While we were talking, someone (I can’t recall who, and I apologize) mentioned that bonsai has XML output and pointed to a buildbot script. After some analyzing, I came to the part I was looking for. It seems that any bonsai query can output XML by adding “&xml=1” to the end of the url string.
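If you build the query url in code, it is safer to add the parameter properly than to glue a string on the end. A small Python sketch follows. The host name, script name and query parameters in the example are placeholders, not a real bonsai server. Only the xml=1 parameter comes from the tip above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_xml_output(query_url):
    """Return the same query url with xml=1 set exactly once."""
    parts = urlsplit(query_url)
    params = [(key, value)
              for key, value in parse_qsl(parts.query, keep_blank_values=True)
              if key != "xml"]
    params.append(("xml", "1"))
    return urlunsplit(parts._replace(query=urlencode(params)))


if __name__ == "__main__":
    # Placeholder url, for illustration only
    print(with_xml_output("http://bonsai.example.invalid/query.cgi?module=all&date=day"))
```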

Rock’n. I got a few more things out of the way, and hopefully have something out soon!


May 19th, 2008 |

Tags: personal, tinderbox




Taking on the WildOn(es)

addons 1 Comment »

I started writing this a week and a half ago, but just finished it today.

First day interning at Mozilla. I finally found out what I get to do this summer. I got the OK to blog about it, because you know how secret them Mozilla folks are about their secret in-house projects (ie. What is this guy up to? ;) ).

The actual wiki page was apparently out in the open, but no-one heard about it. It’s called WildOnAddons. While a new name is, IMO, mandatory, it’s actually a pretty neat idea. There are many great extensions, such as Ted’s Extension Developer’s Extension, that aren’t hosted on AMO. Some other extensions are hosted on AMO, but frequently get updates on their own website well before those updates go public on AMO.

Sometimes, extensions come bundled with packages such as Norton and McAfee. Google Notebook is one of many Google Labs extensions hosted on their own server.

In short, they’re hosted everywhere. But that presents a problem: how many are out there, and can we find and index them?

This is actually a lot harder than going on google and typing filetype:xpi, because according to those results, AMO only has 78 extensions. In fact, there are several repositories of addons, each catering to a different crowd (yes, we are counting all addons). While I don’t think that AMO can satisfy everyone all the time, this might help us figure out how many extensions are out there and how many are hosted on our servers. Actually figuring this out will take a lot of work, and it is not as straight-forward as it sounds (ie. all of AMO’s sandboxed addons require authentication, so a web crawler would have to know about that if we were crawling through the web), but it will be worth it in the end.

I’ll keep blogging about it under the wildon tag RSS feed if you’re interested in how progress goes.


May 15th, 2008 |

Tags: intern, mozilla, seneca, wildon




Well, I have to tease a little

personal 1 Comment »

You probably guessed from my last post what I was working on. Well, here is a screenshot of what it’s starting to look like (this is an old tinderbox log, and options obviously isn’t synced with the UI just yet).

Tinderbox icon in the status bar, and options window showing

There is already an extension that does something similar, hidden within the tinderbox page. But I’m still happier with this result, since anything that saves a trip to the tinderbox page is a nice thing to have!

I have also been debating which license I want to give this program. I am basically limited to around three choices, MIT/GPLv2/{Beer|Donation|Charity}ware, each with its own unique traits. I can’t really see anyone commercializing this or putting it into some sort of binary extension, so I don’t think the GPL would really benefit me. Nothing is set in stone. There is still time to make that decision.


May 12th, 2008 |

Tags: personal, seneca, tinderbox




Enjoying a bit of sight seeing

personal No Comments »

This weekend, some of the interns (Gary and Armen) and I did some sightseeing throughout the California area and clocked in almost 170 miles. Almost all the places I went to I had already been to last year, except Watsonville, which is some obscure town south of Santa Cruz. We passed Pigeon Point Lighthouse again, but didn’t stop by. Here is the route we took for some beautiful scenery on Saturday :

An iPod is much beloved once you get on the road near the trees/mountains, where the reception gets too poor to listen to anything other than hee-haw music. When we arrived in Half Moon Bay, we stopped at a beach for a bit and Armen went for a dip in the freezing Pacific water. Gary managed to get a shot of Armen’s “Baywatch moment”. I was cold with a light jacket on, so I can’t imagine how bad it must have been for him.
We also made a stop at a vista point to look over the landscape and seascape. It was actually quite beautiful. A few bugs hit our windshield and made a squishy sound.

Today, on Sunday, we met with Anthony and did some San Francisco sightseeing. The route we took can’t be plotted on google maps properly because of one-way streets. But here is the best I could do :

We walked around, a lot. To the point where it was shorter for Anthony to walk home rather than go back to the van with us. The trip back was long when you have to climb steep hills. We also passed the crooked road that SF seems to be famous for. We were going to go down, but the lineup of cars was huge and didn’t seem worth it.

I have pictures of both adventures. I just need to upload them to the website. Stay tuned for pics.

Both trips were a lot of fun, but very exhausting. The exhausting part doesn’t make it feel like a weekend. We’re already planning the next weekend! So I guess I’ll have to rest on the weekdays ;)

Update 1: I uploaded our San Francisco pictures.


May 11th, 2008 |

Tags: intern, personal




Working on the tinderbox’n

Web 2 Comments »

I’ve been writing an extension that uses part of Tinderbox’s (56K warning) json.js file. It’s an interesting experience, since I haven’t done much work with JSON before.

At over a meg, this json file takes quite a while to load. While parsing it and playing around with it for my own purposes, I noticed a few things that I would like to see :

  • A JSON formatter refuses to touch json.js because it is too big. So I had to do one of my own (need to upload it once I pretty it up).
  • JavaScript reportedly can load compressed javascript files. It would be mighty dandy for it to load compressed json (shrinking it down to a much smaller 84KB). Maybe it can! I have not been very successful.
  • Tinderbox’s JSON output isn’t real JSON, but that has been noted and filed in bugzilla. Hmm, I wondered why an error message was being written to my console ;)
  • I haven’t yet found a (simple) way to associate a check-in with a time/person, so I can’t “blame” a burning build on anyone. It got to the point where I was just about to comment asking them to reopen the bug, but after loading a new json.js file I noticed some things that were not in the previous file. Mainly, the last json.js file I downloaded had ‘undef’ all through one section, and this one has a few names and ids, so I can sorta match when they checked in.
  • There are files littered around tinderbox holding a bunch of this data that json.js is supposed to replace (See Tinderbox’s README file, Other Files section). When I just started using JSON, the almost-CSV file was both direct to the point and pretty much what I wanted out of the JSON file anyways. But it was still missing some things, like who checked in, the log file, the stats. And another file sorta had that information. So it was spread out. I am really hoping that json.js consolidates and really fixes this problem. But at the same time, it is also fairly complex.
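To get a feel for how much a JSON payload shrinks under gzip, here is a quick sketch. It is Python rather than extension JavaScript, the sample data is invented, and it says nothing about whether Firefox will load a compressed json file for you. It only shows the round trip of compressing, decompressing and parsing.

```python
import gzip
import json


def pack(data):
    """Serialize to JSON and gzip the result."""
    return gzip.compress(json.dumps(data).encode("utf-8"))


def unpack(blob):
    """Reverse of pack: gunzip, then parse the JSON text."""
    return json.loads(gzip.decompress(blob).decode("utf-8"))


# Made-up, repetitive payload standing in for a large status file
sample = {"builds": [{"name": "Linux build %d" % i, "state": "success"}
                     for i in range(500)]}

raw_size = len(json.dumps(sample).encode("utf-8"))
packed = pack(sample)

if __name__ == "__main__":
    print(raw_size, "bytes raw,", len(packed), "bytes gzipped")
```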

Anyways, it will be all fun and worth it when this is done. At least, I’ll be using it :)


May 6th, 2008 |

Tags: json, mozilla, personal, seneca, tinderbox




Copyright © 2010 Softcore software development All Rights Reserved