Wuff: 2006-06

Friday, June 30, 2006

music: Hole - Malibu

I used to pine over the death of MTV and VH1, but YouTube provides most music videos on demand.

I have no idea if Courtney Love/Hole did anything else this good. The bassist is Melissa Auf der Maur.

Courtney Love's other contribution is that she repurposed Steve Albini's ideas to make a great anti-music biz speech.

Categories: music, Hole, Malibu, videos

Sunday, June 11, 2006

web: knowledge and semantics

The things that pass for knowledge, I can't understand

(Steely Dan, "Reeling in the Years")

I'm working a little bit with the Semantic MediaWiki project. It's already useful and I hope Wikipedia itself picks it up so you can query for [[produced by::Thomas Dolby]] and get a list of records. You can see Semantic MediaWiki in action on a test wiki. Try the San Diego page and my user page, see the summary "factbox" at the bottom of each and click the looking glass icon next to relations and attributes.

There's information out there. People can't possibly digest it all so they need help finding it. Meanwhile machines can digest it all, but don't understand it. I hope that they can meet in the middle, and Semantic MediaWiki is one of the best bridges.

I've worked on a string of so-called "Knowledge Bases", a term I've never liked.

First capturing support interactions and follow ups in Lotus Notes, and providing exports of them to customers.
Then writing simple Perl scripts to capture keywords like product and version, output them in HTML meta tags, and teach Verity and Atomz search to search on these.
Last a Kanisa knowledge base that actually got fairly smart with topic maps and nested categorizations.

All of these fall down at the point where a human has to input something useful in a field like Keywords. Many people can't give a good title, let alone a synopsis, even though they're subject experts.
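The second approach, pulling keywords such as product and version out of a support note and emitting them as an HTML meta tag, can be sketched roughly as below. The originals were Perl scripts; this is a Python approximation, and the product names, the version pattern, and the function names are all hypothetical stand-ins rather than what those scripts actually contained.

```python
import html
import re

# Hypothetical product names; the real list would come from the support system.
PRODUCTS = ["WidgetServer", "WidgetClient"]

# Assumed convention: versions appear as "version 2.1" or "v2.1".
VERSION_RE = re.compile(r"\b(?:version|v)\s*(\d+(?:\.\d+)+)", re.IGNORECASE)


def extract_keywords(text):
    """Collect known product names and version numbers mentioned in the text."""
    lowered = text.lower()
    keywords = [p for p in PRODUCTS if p.lower() in lowered]
    keywords += [m.group(1) for m in VERSION_RE.finditer(text)]
    return keywords


def meta_tag(keywords):
    """Render the keywords as an HTML meta tag a search engine could index."""
    content = html.escape(", ".join(keywords), quote=True)
    return f'<meta name="keywords" content="{content}">'


note = "Customer reports WidgetServer version 2.1 hangs on startup."
print(meta_tag(extract_keywords(note)))
```

Automated extraction like this only catches what it has been told to look for, which is why everything else still depended on a human filling in the fields.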

From the machine understanding end, one of the most inspiring things I've ever envisioned came from the Cyc project, Doug Lenat's 21-year-old project to teach a computer enough common-sense facts so it can comprehend a newspaper story. Cyc predates the World-Wide Web, and I and lots of others realized that combined with a Web crawler it could read everything online in three weeks and would then come up with amazing insights. Alas, Cyc just isn't happening; "common sense" is as elusive and meaningless as everything else in hard AI.

Semantic MediaWiki is poised to bring those two ends closer. The wiki effect solves the bad input problem, because interested strangers will make editing passes to improve the semantics, just like other editing passes (it's impressive to see sets of Wikipedia pages get consistent and acquire more elaborate infoboxes and categorization over time).

The querying and exploration made possible by semantic relations helps people find relevant articles and information. Imagine trying to find cities in Europe with a population over one million. Even though Wikipedia has articles with that data, the best you can do is hope for a category of [Large European Cities] and then read each in turn, searching for the word "population" and reading the figure that comes after it. Or trying to find every current president. Semantic MediaWiki identifies the facts in articles and the relations between them so you're not guessing for a good set of search terms and then hoping the Google snippets have the answer.
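To make the difference concrete, here is a minimal sketch of what the European cities question looks like once the facts are structured data instead of prose. This is plain Python, not Semantic MediaWiki's query syntax; the `City` record, the `large_cities` function, and the rounded population figures are illustrative assumptions only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class City:
    name: str
    continent: str
    population: int


# Illustrative, rounded figures only; not authoritative data.
CITIES = [
    City("Berlin", "Europe", 3_400_000),
    City("San Diego", "North America", 1_300_000),
    City("Heidelberg", "Europe", 140_000),
    City("Madrid", "Europe", 3_100_000),
]


def large_cities(cities, continent, min_population):
    """Return sorted names of cities on a continent above a population threshold."""
    return sorted(
        c.name
        for c in cities
        if c.continent == continent and c.population > min_population
    )


print(large_cities(CITIES, "Europe", 1_000_000))
```

Once each article carries its facts as attributes, the question becomes a one-line filter rather than a reading assignment.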

The relations between articles can be exported as RDF, which after a lot of big word effort involving ontologies and predicate vocabularies and OWL is amenable to simple "reasoning", such as Berlin is located in Germany and Germany is located in Europe implies Berlin is located in Europe. That's still a long way from machine understanding, but I'm not sure anyone even knows what it would mean for a machine to understand a Web page.
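The Berlin example is a transitive relation, and the "reasoning" amounts to computing a transitive closure. The sketch below does that over plain Python tuples; it assumes nothing about RDF or OWL tooling, and the function name `infer_located_in` is made up for illustration.

```python
def infer_located_in(facts):
    """Expand (place, container) pairs with every fact implied by transitivity."""
    closure = set(facts)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                # If a is in b, and b is in d, then a is in d.
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


FACTS = {("Berlin", "Germany"), ("Germany", "Europe")}
inferred = infer_located_in(FACTS)
print(("Berlin", "Europe") in inferred)
```

A real reasoner would learn that "located in" is transitive from the ontology rather than having it hard-coded, but the inference itself is no deeper than this loop.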

Of course, any attempt to tell machines the subject of a Web page is immediately subject to abuse by fake and parasite Web sites, as I've noted. I remember when you could clearly identify the subject of a Web page by adding it to the dmoz open directory project and putting KEYWORDS and DESCRIPTION META tags in the HTML head section. Now there are far richer ways to express information in Web pages, like microformats and RSS summaries, but search engines like Google seem to deliberately ignore them and stick to brute-force indexing. Again, the wiki editing effect will keep this in check. Already I usually skip searching Google in favor of guessing Wikipedia article names (I have a Firefox keyword bookmark http://en.wikipedia.org/wiki/%s with shortcut 'w' so I just type "w Cyc" in the browser).
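For anyone curious how the 'w' shortcut works, here is a rough sketch of the substitution: the first word typed selects a bookmark, and the rest replaces %s in its URL. The `BOOKMARKS` table and `expand` function are hypothetical, and the percent-encoding of spaces is my assumption rather than a statement of exactly what Firefox does.

```python
from urllib.parse import quote

# Keyword -> URL template, mirroring the single shortcut described above.
BOOKMARKS = {"w": "http://en.wikipedia.org/wiki/%s"}


def expand(typed):
    """Turn 'keyword rest of query' into a URL, or None if the keyword is unknown."""
    keyword, _, query = typed.partition(" ")
    template = BOOKMARKS.get(keyword)
    if template is None:
        return None
    # Assumption: the query is percent-encoded before substitution.
    return template.replace("%s", quote(query.strip()))


print(expand("w Cyc"))
```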

Categories: SemanticWeb, search, web

Saturday, June 10, 2006

software: astride a mountain of Open Source riches

I've contributed a small amount of code to the Semantic MediaWiki project, which adds semantics to the system which runs Wikipedia and other sites.

Getting started was phenomenal. The Semantic MediaWiki code extends MediaWiki, which is written in the PHP language, runs on the Apache Web server, and stores information in the MySQL database. I downloaded the code for SMW and MediaWiki, the XAMPPLITE bundle of Apache/PHP/MySQL and utilities set up to run on Windows, the massive Eclipse software development environment, and the PHP editing extension to Eclipse. A few hours later, I'm making trivial enhancements on top of several million lines of source code. Every element in this is open source, free in price and free of restrictions so I can inspect and modify it. Their development takes place in the open on the Internet so I can search for bugs, documentation, and problems. The barrier to entry is zero!

It's like writing a poem, and not only is your typewriter free, but so is the printing press, the paper, and the ink.

Categories: software, SemanticWeb, OpenSource

Friday, June 9, 2006

computers: One Laptop per Child design

1st working prototype of the One Laptop per Child design

Read this spec to see some great innovation in the One Laptop per Child effort, the so-called "$100 laptop". It's exciting to imagine such careful hardware design and a treasure trove of open source software in hundreds of millions of kids' hands.

Will the Bill and Melinda Gates Foundation, Andy Grove's foundation, and the other billionaire foundations have the backbone to endorse, promote, and bankroll a non-Wintel design?

I'll forgive Mr. Jim Gettys for trashing NeWS (Sun's Network extensible Window System, by James Gosling who later created Java) back in the days of the war of NeWS vs. the X Window System and Sun/AT&T vs. every other workstation vendor :-)

Categories: computers, OLPC

Thursday, June 8, 2006

web: Net Neutrality no-brainer

I pay my ISP to connect me to all of the Internet, all 400 or so protocols, all the millions of 'net addresses. My ISP is welcome to sell me a faster connection. The idea that ISPs can charge the other end of the connection additional money to get to me is completely f***ing insane. Edward Whitacre of AT&T says:

"Now what they [big Internet companies] would like to do is use my pipes free, but I ain't going to let them do that because we have spent this capital and we have to have a return on it."

Hey, you lying jerk, the site at the other end of my connection already pays to connect, and likewise each is free to pay more for better connections (I've worked at companies that spend a lot of time and $$$ deciding whether to go with Akamai, Speedera, or AT&T for co-location and content delivery services).

Some argue this is no worse than supermarkets charging Coca-Cola for better placement at the end of an aisle. But a) we're talking speech here, not groceries, and b) I'm paying to enter an infinite supermarket of ideas.

Click the image above to learn how to talk to your elected representative. I spoke to a nice staffer and supposedly Nancy Pelosi supports the right thing. Here's how the subcommittee voted on net neutrality amendments. If your representative voted wrong, call them up and give them hell. Californian Republicans voted wrong :-(

Categories: web, ideas, NetNeutrality

Tuesday, June 6, 2006

art: greatest comic strip ever

Yes sir, Mr. Principal.. I'm going to give up school.. Everybody says I'm stupid anyway... I've decided to devote the rest of my life to making my dog happy.. No, it isn't such a bad idea, is it, sir? Well, maybe you should talk it over with your cat, and see what he thinks..

Peanuts: © United Feature Syndicate, Inc.

The basic setup is easy: the cloying sentiment of devoting yourself to a pet could appear in Garfield or Ziggy or any of the lame animal-centric current daily strips. But Charles Schulz goes further: the unseen principal endorses the idea (Schulz employs his usual beautiful trick of eliding the adult conversation), and it's wise Charlie Brown who counsels the adult against a rash decision. So now the strip is about dreams deferred, the pull of responsibility, what it means to do the right thing with your life. Plus the Zen zaniness of discussing it with an animal.

But this is the greatest cartoonist of the last 50 years (cartoonists esteem only George Herriman of Krazy Kat higher). His pen is always a knife and he's compelled to use it, inserting "Everybody says I'm stupid anyway..." just as the first panel ends. As I've remarked, Peanuts is bleak! Charlie Brown isn't fulfilling his dream here, he's desperate. Then you notice the hard set of Charlie Brown's mouth in the second panel where you'd expect him to be smiling, and how he's only slightly happy when the principal gets interested, and the triple negatives of "No, it isn't such a bad idea". Note the beautifully ambiguous, expressive hand position in the third panel: he's clutching his heart, kneading his fingers, at peace, holding onto an idea, all at once.

"I've decided to devote the rest of my life to making my dog happy"

Categories: art, Peanuts, CharlesSchulz, dogs

ideas: fair use of a comic strip

I just posted the greatest comic strip ever.

I could have just scanned my newspaper photocopy, but I wanted to do it legit. Displaying the entire thing is probably a copyright violation; I see that Web sites that sell original comic strip artwork only show half the strip. The United Media FAQ put me in touch with United Media licensing; I told them I wanted both a physical copy of the strip and rights to display it on my negligible-traffic Web site. $290 later, I have a large quality photostat, a huge TIFF image file, and the right to show this on my Web site for a year. (Of course the Internet Archive "Wayback Machine" makes expiration of information on the Web meaningless...)

I don't mind the expense, it feels good to do the right thing and maybe the estate of Charles Schulz gets a decent cut.

But what's odd is this strip was already part of my life, so there's any number of ways I could get the image on my site and claim "fair use".

If the strip was published in a book, then fair use would allow me to print a page of complete strips in a book review. Amazon Online Reader/Search Inside the Book already shows entire pages of the Fantagraphics' "The Complete Peanuts" books.
Or I could post a picture of my room at high resolution and invite viewers to zoom in on the framed strip (Firefox users, I recommend you enable Tools > Options > Advanced > General > Resize large images to fit in the browser window).
Or I could post a picture of me reminiscing over the newspaper, and invite viewers to zoom in.
snoopy.com features a rotating selection of Peanuts strips, so should they ever feature this strip I could do a screen capture as part of an article about the site, or point people to the Internet Archive for that day.
