I’m looking to add a significant amount of Music data to the OpenRecommender project to power the new Music Recommendation services.
Since I couldn’t find one, I feel compelled to write a side-by-side comparison between MusicBrainz (the old trusty) and the Million Song Dataset by LabROSA @ Columbia U. (the new kid in town). The following is the breakdown, and though I’ll offer my thoughts at the end, I’ll encourage the reader to decide which is the best data source.
L10N is a numeronym for Localization (the 10 stands for the ten letters between the “L” and the “N” in the word). Localization is the adaptation of software and language to a specific region’s (or, more precisely, locale’s) dialect and socio-cultural conventions, so that the result is politically correct and visually appealing for that audience. The standard most commonly used when implementing L10N support is ISO 3166, which defines country codes; a locale identifier typically pairs one of these with an ISO 639 language code.
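To make the counting concrete, here is a tiny sketch of how such an abbreviation is formed. The function name `numeronym` is just an illustrative choice, not something from any standard library.

```python
def numeronym(word: str) -> str:
    """Abbreviate a word as first letter + number of inner letters + last letter."""
    if len(word) < 3:
        # Nothing to count between the first and last letter.
        return word
    return f"{word[0]}{len(word) - 2}{word[-1]}"


print(numeronym("Localization").upper())          # L10N
print(numeronym("Internationalization").upper())  # I18N
```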
While a complete list of country codes is available on the ISO’s ISO 3166 website, there is no particularly easy-to-use version of the data apart from a zipped XML file. For convenience’s sake, we’re offering an L10N SQL script in an easy-to-use 2-column format, and an HTML select drop-down version below:
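The scripts themselves aren’t reproduced in this excerpt. As a rough illustration of how a 2-column (code, name) list maps onto a select drop-down, here is a minimal sketch. The three sample rows, the `build_select` helper and the `country` field name are assumptions made for the example, not the contents of the actual script.

```python
from html import escape

# Sample rows only: (ISO 3166-1 alpha-2 code, country name).
COUNTRIES = [
    ("CA", "Canada"),
    ("JP", "Japan"),
    ("US", "United States"),
]


def build_select(name, rows):
    """Render (value, label) pairs as an HTML <select> element."""
    options = "\n".join(
        f'  <option value="{escape(code)}">{escape(label)}</option>'
        for code, label in rows
    )
    return f'<select name="{escape(name)}">\n{options}\n</select>'


print(build_select("country", COUNTRIES))
```

The same pairs could just as easily be written out as INSERT statements for the 2-column SQL table.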
An increasingly common problem on the web (not to mention Computer Science in general) is the integration of disparate data sources.
Whether those data sources are accessed via Web Services, APIs, databases, scraped content, or even plain old flat files seems to matter less and less compared with the quick and efficient production, consumption and integration of the data itself.
For this reason, BCmoney will be launching a number of new products in 2009-2010, the early prototypes of which can be seen here:
- BC$ MobileTV – Video Repository (mashup of Freebase, IMDB, Wikipedia, YouTube, DailyMotion, Veoh, Blip.TV, MegaVideo & more…)
- BC$ MobileTV – Music Repository (mashup of Freebase, MusicBrainz, Wikipedia, MyPlaylistz, Last.FM, Blip.FM, LyricsWiki)
- BC$ MobileTV – News Portal (personalizable mashup of RSS news feeds from all over the web)
These represent a departure from the typical destination site by integrating data from various sources in dynamic, user-defined ways. Look for continued developments in these and other aspects of the BCmoney MobileTV service in the near future!
MySQL can be incredibly useful and simple at times, while at other times its lack of full joining (there is no FULL OUTER JOIN, and LEFT OUTER JOIN is just a synonym for LEFT JOIN) and its weak Referential Integrity can become frustrating when you are trying to provide this functionality within the Database Server.
(I know, I know, InnoDB solves most of these problems, but it’s still not complete.)
So the problem is this:
Table 1 – ACCOUNT
Table 2 – ACCOUNT TYPE
account_type_ID account_name start_url end_url
Without going into too much detail, I wanted to store a list of a user’s accounts on other popular websites all over the web, while still keeping a normalized database (it could get out of hand otherwise, if you had a large number of users and crammed both accounts and account types into one table). Thus the solution is, of course, to normalize to two tables: one for the various users’ account names and one for the account types (i.e. Amazon, eBay, YouTube, Facebook, MySpace, etc…)
This brings about one little problem, in that you’ll need to perform a JOIN to present each separate user with “their accounts” plus all the other available accounts that they have yet to create/add to your site for merging. Should be a basic JOIN, no problem, right?
Wrong… good luck trying to solve that smoothly without multiple different forms: one for accounts they have added, one for accounts they haven’t added, and one for accounts that they can still possibly add. This can (and should) be reduced to two screens: one for managing all their accounts, and one for displaying all the accounts they have added thus far. That is a far better solution… but in order to do that we need to JOIN on existing and non-existing data.
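For the curious, here is a rough sketch of one common way to JOIN on existing and non-existing data: start from the account types and LEFT JOIN the user’s accounts, keeping the user filter inside the ON clause so that unmatched types still come back with NULLs. This is not necessarily the solution from the full post. It uses Python’s built-in sqlite3 module as a stand-in for MySQL, and the `user_id` and `username` columns, the lowercase table names and the sample rows are all assumptions made for the example.

```python
import sqlite3

# In-memory database standing in for the MySQL server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE account_type (account_type_id INTEGER PRIMARY KEY, account_name TEXT);
CREATE TABLE account (user_id INTEGER, account_type_id INTEGER, username TEXT);
INSERT INTO account_type VALUES (1, 'Amazon'), (2, 'eBay'), (3, 'YouTube');
INSERT INTO account VALUES (42, 1, 'alice_amz'), (42, 3, 'alice_yt'), (7, 2, 'bob_ebay');
""")


def accounts_for(user_id):
    """Return every account type, with the user's username or None if not added."""
    return conn.execute(
        """
        SELECT t.account_name, a.username
        FROM account_type AS t
        LEFT JOIN account AS a
          ON a.account_type_id = t.account_type_id
         AND a.user_id = ?   -- filter here, not in WHERE, to keep unmatched types
        ORDER BY t.account_type_id
        """,
        (user_id,),
    ).fetchall()


for name, username in accounts_for(42):
    print(name, username or "(not added yet)")
```

Moving the `user_id` condition into a WHERE clause would discard the rows with no matching account, which is exactly the “non-existing” data a single management screen needs.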
Read on for the solution…
I always find it useful to share publicly those tasks I find myself carrying out many, many times, almost robotically. In fact, I’d like for there to be a worker bot (soft AI, not hard AI, and software, not hardware… we don’t need a Terminator-style takeover) to do my dirty work for me.
Well, until such a helpful little worker bot exists, I’ll have to settle for blogging and sharing via Social Networks to appease my data reuse and job simplification desires. Here’s some data for International Currency information that I end up using in many databases (and I’ve added to my upcoming OpenRecommender project, to shamelessly plug my projects).
To save you a dig through the source, here’s the HTML for a quick Select drop-down box:
BC$ = Behavior, Content, Money
The goal of the BC$ project is to raise awareness and make changes with respect to the three pillars of information freedom - Behavior (pursuit of interests and passions), Content (sharing/exchanging ideas in various formats), Money (fairness and accessibility) - bringing to light the fact that:
1. We regularly hand over our browser histories, search histories and daily online activities to companies that want our money, or that want to benefit from our use of their services through lucrative ad deals or sales of personal information.
2. We create and/or consume interesting content on their services, but we aren't adequately rewarded for our creative efforts or loyalty.
3. We pay money to be connected online (and possibly also over mobile), yet we lose both time and money by allowing companies to market to us with unsolicited advertisements, irrelevant product offers and unfairly structured service pricing plans.