Tuesday, January 22, 2008

Domain Name collecting and Behavioral Web Segmentation

Here's a cute blog and thread on the topic of domain name collecting, followed by a nice application of the economics of supply and demand to the same topic (on a blog devoted entirely to domain names).

Thursday, January 10, 2008

More thoughts on Google Trends and a Wikipedia Visitor Tracker

Google Trends, introduced to the blog last year, is a current indicator of the most popular search terms (presumably only on Google, right?) along with the occurrence of related news articles on the topic. It is apparently still in beta, with intermittent support, if the information in Wikipedia is up to date. How could the 'most viewed' or 'most emailed' news information factor into this graphic representation of a cross tab? Wouldn't a volume measure of articles, or the number of times a single article was viewed or emailed, be a more accurate measure of public interest? Google Hot Trends is similar but takes a 24-hour perspective.

Here's a neat Wikipedia tool that keeps tabs on most viewed links for any given day.

Has anyone come across other search engine keyword tracking tools, a Yahoo Trends for example? Or better yet, one that consolidates the most popular searches across all search engines?

Tuesday, January 8, 2008

2008 is the Year To Share Data

Facebook announced today that they are joining the Dataportability Workgroup. This unexpected move could help accelerate acceptance from other large social websites. APML is another data sharing protocol that is gaining some momentum. Essentially it is an attempt to define a common set of standards to allow personal Attention Profile data (bookmarks, photos, blog posts, browsing history, music interests, etc.) to be transported (painlessly) to other web services and applications.

For instance you could group together items for public consumption separately from those for specific private groups and publish appropriate access for each, even though the items themselves are located in different places.

Now if only we could resolve all the issues surrounding data privacy.

Monday, January 7, 2008

The first word that comes to mind

Sometimes idle browsing on the internet nets some interesting finds. This site asks visitors to 'play': they are presented with a word and have to type in the first word that comes to mind. The most amusing results I found so far were for Clinton. It's an interesting application. If search engines deliver results based on keywords and phrases, and SEO is rooted in similar methodology, might there not be an extended application for these results as well? Some kind of algorithm like the following, for example:

word relation score = (number of times a word is associated by visitors * medhigh weight) + (number of times word is presented for association * medlow weight) + (number of times word is 'passed' * highnegative weight) + (number of times word is 'answered' * highpositive weight) + (number of times site is abandoned when visitor is engaged in the process until the word is introduced for matching * highest weight)
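As a rough illustration, the weighted sum above could be sketched in Python as follows. The field names and the numeric weights are purely hypothetical placeholders for "medhigh", "medlow" and so on. The sign of the abandonment weight is also an assumption: the formula only calls it the "highest" weight, so it is kept positive here and could just as easily be made negative.

```python
from dataclasses import dataclass


@dataclass
class WordStats:
    """Per-word counts collected from the word association game (hypothetical fields)."""
    associated: int  # times visitors typed this word as an association
    presented: int   # times the word was presented for association
    passed: int      # times visitors 'passed' on the word
    answered: int    # times visitors 'answered' the word
    abandoned: int   # times visitors abandoned the site when this word came up


# Placeholder weights: only their relative ordering follows the formula above.
WEIGHTS = {
    "associated": 0.5,   # medhigh
    "presented": 0.25,   # medlow
    "passed": -1.0,      # highnegative
    "answered": 1.0,     # highpositive
    "abandoned": 2.0,    # highest (sign is an assumption)
}


def word_relation_score(stats: WordStats, weights: dict = WEIGHTS) -> float:
    """Return the weighted sum of the five counts for one word."""
    return sum(getattr(stats, name) * weight for name, weight in weights.items())


example = WordStats(associated=10, presented=40, passed=4, answered=30, abandoned=2)
print(word_relation_score(example))
```

Keeping the weights in a dictionary makes it easy to try different values, or to flip the sign on abandonment, without touching the scoring function.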

Then search engines might also filter out results that have a high probability of resulting in search visitor abandonment, or deliver results that might not match synonyms or keywords or appear in paid ads but are still relevant to certain segments of customers.

It would be interesting data to append to web and search data, with the new variables used for visitor segmentation. Has anyone come across anything related?

Thursday, January 3, 2008

It's Your Data - You Should Be Able to Share It!

The ability to take the data that you upload on a website and use it elsewhere is a fundamental feature that is missing on many popular websites. Facebook, for instance, allows you to easily import all your address book contacts, but there is no means to export these same details. Yahoo Mail lets you export your e-mail messages, but only if you subscribe to their premium mail package. Flickr provides lots of tools for uploading multiple photos at a time, yet third-party tools are needed to copy the same photos elsewhere. There should be a middle ground that allows your stuff to be easily mixed and shared. is a new resource point that discusses the standards, tools, arguments and initiatives that are shaping this debate.

Your politics might impact your income, at least in Venezuela

This link provides an interesting example of data collection and identification used by a government. In 2003 the Venezuelan opposition distributed petitions and collected 3 million signatures (twice), and these lists eventually found their way to publication on the internet. The opposition effort failed, but a data-merging effort resulted. By matching the list of supporters of Chavez's removal against a national database of citizens (including addresses and dates of birth), the government obtained a list of citizens whose loyalty was, at least at the time of the petition, in question. Statisticians have tied household income data to the 0-1 variable indicating presence on the signature list and determined that a 4% decrease in income resulted from signing the opposition papers. It would be interesting to look at global income outcomes in relation to voting history and incumbent-party variables over time, in both industrialized and developing countries.

