music: how to contribute scanned lyrics to the web

Lyrics are everywhere on the web, yet I regularly come across popular songs whose lyrics are nowhere to be found. Sometimes I have a CD or LP on the shelf with the missing lyrics printed in it! Time to make a little more of the sum of human knowledge available… Here are my notes on the process.

Where to contribute lyrics?

Many sites let users contribute and update lyrics. Ideally there would be a non-commercial user-supported über repository of lyrics, but if there is I can’t find it. All lyrics sites seem to be ad-supported (I don’t see the ads because I use the uBlock Origin ad-blocker). The worst are the sites that optimize their pages to fake out Google search so they show up high in search results for e.g. “Linx You’re Lying lyrics,” but when you visit them the page’s only content is “Be the first to contribute the missing lyrics of You’re Lying by Linx!” Kthxbye.

LyricWiki? (No)

The obvious contender is LyricWiki. It uses the same underlying MediaWiki software as Wikipedia, but it’s on the ad-supported Wikia platform that Wikipedia founder Jimmy Wales created. It has a genuine community trying to do a good job. I made many cleanup edits and added a few songs in 2008-2011. The problem with LyricWiki is that nothing is created for you; you have to create page after page with wiki text. So (using the example of adding the lyrics of Max Tundra’s Mastered by Guy at The Exchange album): first you have to add a bit of fiddly markup to the band’s page for the album:

==[[Max Tundra:Mastered By Guy At The Exchange (2002)|Mastered by Guy at The Exchange (2002)]]==
 {{Album Art|Max Tundra - Mastered By Guy at the Exchange.jpg|Mastered by Guy at The Exchange}}
# '''[[Max Tundra:Merman|Merman]]'''
# '''[[Max Tundra:Mbgate|Mbgate]]'''

then you have to create the album’s page with more fiddly markup listing each song all over again:

 |artist    = Max Tundra
 |album     = Mastered by Guy at The Exchange
 |genre     = Electronic
# '''[[Max Tundra:Merman|Merman]]'''
# '''[[Max Tundra:Mbgate|Mbgate]]'''

then you have to create a page for each song that points back to the album with even more fiddly markup, and then you provide the actual value, the lyrics themselves:

 |song     = Merman
 |artist   = Max Tundra
 |album1   = Max Tundra:Mastered By Guy At The Exchange (2002)
 |language = English
 |star     = Bronze
I'm feeling flirty
Must be you heard me
My knee is hurty

Even if you’re fluent in MediaWiki markup and templates, it is pointless error-prone duplication to keep repeating the artist, album, and track name on every page. Instead, adding a lyric should be a single database action that automatically adds the song to the artist’s page and the album’s page.
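
To make the duplication concrete, here is a minimal Python sketch that generates the repeated fragments from one record. It only reproduces the markup lines quoted above; the Album class and the function names are my own illustrative inventions, not anything LyricWiki provides, and the surrounding templates are left out because I am not quoting them here.

```python
from dataclasses import dataclass


@dataclass
class Album:
    """One record holding everything the three kinds of page repeat."""
    artist: str
    page_title: str  # wiki page name, e.g. "Mastered By Guy At The Exchange (2002)"
    display: str     # link text, e.g. "Mastered by Guy at The Exchange (2002)"
    songs: list


def song_link(artist, song):
    """One numbered track-list line, as used on the artist and album pages."""
    return f"# '''[[{artist}:{song}|{song}]]'''"


def artist_page_section(album):
    """Album heading plus track list for the artist's page."""
    heading = f"==[[{album.artist}:{album.page_title}|{album.display}]]=="
    return "\n".join([heading] + [song_link(album.artist, s) for s in album.songs])


def song_page_fields(album, song):
    """The fields on a song page that point back to the artist and album."""
    return "\n".join([
        f" |song     = {song}",
        f" |artist   = {album.artist}",
        f" |album1   = {album.artist}:{album.page_title}",
    ])


album = Album(
    artist="Max Tundra",
    page_title="Mastered By Guy At The Exchange (2002)",
    display="Mastered by Guy at The Exchange (2002)",
    songs=["Merman", "Mbgate"],
)
print(artist_page_section(album))
print(song_page_fields(album, "Merman"))
```

The artist, album, and track names are typed once and every fragment is derived from them, which is the single-action behaviour I wish the site had.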

So Genius!

Genius came out of annotating rap lyrics. It has a nice interface for adding song lyrics to albums, a solid community, and lets people comment on songs and individual lines. So I went there.

Scanning and converting to text

On my all-in-one printer I scanned the record sleeves and CD booklets with the lyrics at high resolution and saved them as PDFs. Then I used gImageReader-qt5 for Linux to do optical character recognition. This works impressively well! It handled blue-on-pink text, and it automatically identifies each block of text. You then delete the blocks you don’t want recognized, such as image captions and “Thanks to Kev and Fender guitars”, trigger OCR, and it gives you a big chunk of recognized text.

Case conversion

Some lyrics that I scanned were printed entirely in UPPER CASE. There are many ways to convert case, but the wrinkle is that I want the first word of each line to remain capitalized; a bit of smarts about proper names, the word “I”, and such would also be nice. I found a web page that does the right thing in its Sentence case mode; it saved me hacking my own tool. The other nice thing about web-based converters is that the textarea with the converted text is in the browser, and Firefox highlights many misspellings due to mis-recognition, such as “allbi” instead of “alibi.”
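
For anyone who would rather hack their own tool, here is a rough Python sketch of per-line sentence casing. It is not how any particular web converter works; the PROPER_NOUNS set is a made-up example you would fill in yourself, and it only capitalizes the start of each line, not sentences within a line.

```python
import re

# Names to re-capitalize after lowercasing. This set is a made-up example;
# fill it with the names that appear in your own lyrics.
PROPER_NOUNS = frozenset({"mini", "london"})


def sentence_case_line(line, proper_nouns=PROPER_NOUNS):
    """Lowercase one shouted line, then restore the capitals that matter."""
    text = line.lower()
    # The pronoun "I", including contractions such as "I'm" and "I'll".
    text = re.sub(r"\bi\b", "I", text)
    # Known proper nouns.
    text = re.sub(
        r"[a-z]+",
        lambda m: m.group().capitalize() if m.group() in proper_nouns else m.group(),
        text,
    )
    # First letter of the line.
    for pos, ch in enumerate(text):
        if ch.isalpha():
            return text[:pos] + ch.upper() + text[pos + 1:]
    return text


def sentence_case(text):
    """Apply sentence_case_line to every line of a lyric sheet."""
    return "\n".join(sentence_case_line(line) for line in text.split("\n"))


print(sentence_case("I'M FEELING FLIRTY\nMUST BE YOU HEARD ME"))
```

It will not fix OCR mistakes, so a pass through a spell-checker is still worthwhile.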


Genius wants simple ASCII for lyrics: simple quotation marks, hyphens not em-dashes, no ligatures like ﬁ, etc. Unfortunately gImageReader doesn’t have an option to output only simple ASCII. To find the problematic characters, I used this command line to search for any character that isn’t ASCII:

rg '\P{ascii}' *lyrics.txt

(rg is ripgrep, a better text search program than the venerable grep.)
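
Finding the characters is half the job; replacing them is the other half. Below is a small Python sketch that swaps the usual OCR suspects for plain ASCII and reports whatever is left. The replacement table is my own guess at what typically turns up, not a list published by Genius or produced by gImageReader.

```python
# Typical typographic characters and their plain ASCII stand-ins.
# This table is an assumption; extend it as your own scans require.
REPLACEMENTS = {
    "\u2018": "'",    # left single quotation mark
    "\u2019": "'",    # right single quotation mark / apostrophe
    "\u201c": '"',    # left double quotation mark
    "\u201d": '"',    # right double quotation mark
    "\u2013": "-",    # en dash
    "\u2014": "-",    # em dash
    "\u2026": "...",  # horizontal ellipsis
    "\ufb01": "fi",   # fi ligature
    "\ufb02": "fl",   # fl ligature
}
TABLE = str.maketrans(REPLACEMENTS)


def to_ascii(text):
    """Replace the characters listed in REPLACEMENTS; leave the rest alone."""
    return text.translate(TABLE)


def non_ascii_report(text):
    """Return (line, column, character) for every remaining non-ASCII character."""
    return [
        (line_no, col, ch)
        for line_no, line in enumerate(text.splitlines(), 1)
        for col, ch in enumerate(line, 1)
        if ord(ch) > 127
    ]


cleaned = to_ascii("It\u2019s \ufb01ne \u2014 \u201cok\u201d")
print(cleaned)
print(non_ascii_report(cleaned))
```

Anything the report still lists (accented letters, for example) needs a human decision rather than a blind substitution.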

My IQ goes up, the kudos roll in

To keep us unpaid suckers working, Genius has gamified (horrible word) contributions in the form of “IQ points.” When you add a wanted song, you get points. When you identify the song parts (verse, chorus, bridge, etc.) you get more points. More points give you more rights – I can now add a new song and edit a track list, but I still can’t add an entire new album or state that Peter Martin is commonly known as Sketch.

One of the problems I had with the lyrics for the band Corduroy is Genius already listed other songs by “Corduroy” that are by a Korean singer 코듀로이 (which translates to Corduroy) and a wannabe band that reused the name. Renaming artists is very tricky and way above my IQ level, but the forum participants are very helpful. “I have to say I am really impressed with the research you have done here. I will disambiguate the artists to fix this.” Awww.

(Elsewhere I blogged about the semantic confusion of translated band names matching other bands, names containing other bands, and straight up multiple bands with the same name.)

Posted in music, web

art: Jhane Barnes inspires

I’ll pull out a Jhane Barnes shirt I haven’t worn in a while and I’m Ricky Fitts in American Beauty: “I need to remember… Sometimes there’s so much beauty in the world, I feel like I can’t take it.”

Jhane Barnes shirt, fabric woven in Japan

(This and black pants, perfect)

Posted in art

web: when it comes to paying, backward USA! backward USA!

Someone asked me to donate to the Center for Youth Wellness, a worthy cause. Give Lively, a free fundraising system, organizes the donations. The donation page’s “Suggested donation method” is “Donate by bank account” because “We get more from your donation when you pay via your bank account.” The web page draws my bank’s logo and asks me to enter my bank username and password, but I am not on my bank’s web site.

WTF? y’all have got to be kidding me!

What’s actually happening is that a third[*] company called Plaid (never heard of ’em, some tech bro fintech startup) displays the form, technically within an <IFRAME> in Give Lively’s page. I don’t care what security promises Plaid makes, they are completely insane if they think asking for my bank username and password on their (nested) web site is acceptable. Basic web security: never ever enter your username and password for one web site on another web site. If the browser’s location field doesn’t display your bank’s address with a padlock icon, don’t do it! But much the same way every company says don’t trust links in our name that go to other web sites, until they send you a survey or ad that links to one and hope you ignore their own advice, somehow it’s OK to fake customers out because it’s for a worthy cause.

It’s nobody’s fault, though Plaid sure has some chutzpah. Neither the worthy charity nor I want PayPal and some credit card company delaying funds and skimming off money from my charitable donation. Give Lively doesn’t have the in-house expertise to organize a bank transfer so they hand it off to Plaid. Plaid undoubtedly got frustrated trying to organize bank transfers with every stodgy bank under the sun, so they decided to present themselves as my bank, ask for my login, and then order a low-cost Electronic Funds Transfer by impersonating me.

But what makes no sense is why can’t I give my bank the same transfer instructions, the ones Plaid wants to make by impersonating me? Well, if I could then there’s less need for Plaid to be working away in the background. In every other developed country, you just log in to your bank and tell it to give money to any person or organization and It Just Works without any third parties or handing out your credit card details to strangers. I’m not sure why the USA makes it so complicated, to no one’s benefit but middlemen. There’s PayPal’s Venmo (big fees) and more banks are supporting Zelle but it feels that the USA is a decade behind.

Obviously I’m not the only person freaked out by this, see e.g. this Hacker News thread. To its credit Plaid has an open bug tracker on GitHub in which issue 68 is this “privacy/security concerns.” Huffington Post has an entire article about whom you should trust with your banking sign-on. That article says “[Plaid is] a system used by most personal finance apps, like Venmo, Robinhood and Acorns. Plaid, in turn, is trusted by a long list of banks and credit unions.” Give Lively responded on Twitter “Plaid is a secure & trusted industry-leading service that allows donations via bank account. Your bank is on the Plaid platform—like most US-based banks—because it trusts Plaid.” But I don’t see any indication on my bank’s site that it actually trusts Plaid. I just have to hope that if my bank didn’t trust Plaid (or more likely stopped trusting Plaid), it would revoke whatever API authorization it gave Plaid so transfers would fail. And again, a company using a third party that uses my credentials to ask my bank to do some transaction for me is completely back-assward. It only works this way because everyone involved is too ^%$#@! lazy to do the right thing, which is: I go to my bank, authenticate myself, and tell my bank to give someone money using the bank details they gave me.

One other thing: Give Lively/Plaid’s interface defaults to making a monthly donation. If you don’t pay attention Plaid will be taking money out of your account forever even if that wasn’t your intent. Because you are not in the driver’s seat, the organization wanting your money is, and their temptation is overwhelming to tweak the system to maximize the amount you donate, including defaulting to monthly giving.

[*] Update: I was wrong, there’s a fourth company. Give Lively uses Stripe, Stripe takes 0.8%, Stripe uses Plaid. Crazy.

Posted in web

software: how electronic medical records could be better

On an Ars Technica story on AI in hospitals, “goofazoid” commented

There are so many things that have set this environment up, mostly having to do with hospitals attempting to save money (nursing is usually the largest expense in a budget) by having fewer nurses. This has been compounded by data systems (like CHCS/Alta, Meditech) that are not user friendly and take longer to document in than a paper record would, while having lower fidelity than a paper record would.
I think that every nursing unit should probably have at least 2 more nurses/shift, and if you want really good documentation… switch to a tablet type device that can do many time saving things:
-use a smartcard and pin to log in
-use the camera to scan the pt armband so that you don’t have to search for your pt’s records
-use the camera to scan medication bar codes to document
-use the camera to scan blood products bar codes for safety and documentation (either two nurses with two different devices would scan to complete the documentation or the second nurse could scan their badge and type a pin)
-sync vital signs from the device measuring them to the chart
-gives lab results as soon as they are available, and uses the minimum number of alerts to minimize alarm fatigue
-uses the NFC to do things like program IV pumps (directly from the orders), collect amount infused, etc
-has a low power laser range finder to be used in conjunction with the camera to take pictures of wounds, dressings, drainage etc. (the range finder allows a 1x1mm grid to be accurately superimposed on the image)
-has the ability for nurses to use a BT headset to dictate notes rather than typing
on top of all of that, there needs to be a requirement that physicians use a system that is similar- NO F***ING PAPER! If orders are written on paper they must be transcribed and errors can then occur; same goes for phone orders. There is no reason that the MD/NP can’t put the orders in from a handheld device, even when offsite.
-have it set up so that if there is a lab out of range, it messages the provider who can then enter an order (triggering an alert for the nurse)
-have it set up so that nurses can text providers about issues that need addressed less than urgently
-allow diagnostic images to be viewed (EKG, x-ray, CT scan, MRI, Echo-cardiograms, sonograms etc)
The whole concept is to make the system user friendly, interconnected and safe…

My response:

Why aren’t Electronic Medical Records companies throwing money and resources at you to make all this happen?!!!

I recently read the great Atul Gawande’s Why Doctors Hate Their Computers and it is so depressing how far these systems are from helping doctors do a better job: zillions of automatic alerts that everyone ignores, people Select all – Copy – Paste entire reports into fields instead of writing a summary, the choice to spend precious time with a patient staring at a screen or to spend hours at home doing data entry, … (He concedes the systems are benefiting patients who review their records.)

Here’s my idea: instead of every data entry field being a chore, it should be a just-in-time avenue for understanding. If it’s a multiple choice, every previous entry in that field should be shown in a cloud showing the history and the most common values for this patient; if it’s a number, show a sparkline graph of previous readings that highlights deviations from typical results. Etc. It’s stupid to rely on doctors reviewing previous records; instead the systems should show trends and flag non-standard data in real time as people enter it.
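
A toy Python sketch of the numeric case: draw a text sparkline of earlier readings and flag a new entry that sits far from the patient’s own history. The readings, the two-standard-deviation threshold, and the function names are all invented for illustration; a real system would need clinically validated rules.

```python
from statistics import mean, stdev

BARS = "▁▂▃▄▅▆▇█"


def sparkline(values):
    """Map each reading onto one of eight bar heights."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1  # avoid dividing by zero when all readings match
    top = len(BARS) - 1
    return "".join(BARS[round((v - lo) / span * top)] for v in values)


def is_unusual(history, new_value, threshold=2.0):
    """True when new_value is more than `threshold` standard deviations
    from the mean of this patient's earlier readings."""
    if len(history) < 2:
        return False  # not enough history to judge
    sd = stdev(history)
    if sd == 0:
        return new_value != history[0]
    return abs(new_value - mean(history)) / sd > threshold


# Hypothetical earlier readings for one field, oldest first.
history = [120, 118, 122, 121, 119]
entered = 160
print(sparkline(history + [entered]), "<- check this" if is_unusual(history, entered) else "")
```

The point is that the feedback appears while the number is being typed, next to the field, instead of in a report nobody opens.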

And obviously, blockchain!

Posted in software

music: Look Now, a great Elvis Costello album

Slate’s Carl Wilson said it well: “Elvis Costello’s New Album Is His Best This Century”

Yes it is. Look Now has the horns and backing vocals of Punch the Clock, more Bacharach collaborations like Painted from Memory, varied production that echoes (faintly) Imperial Bedroom. Those are great albums, so approaching their heights is excellence.

After the fed-up nihilism of 2008’s Momofuku (definitely missing a ‘c’) in a music biz where a popular album sells 15,000 copies and is guaranteed not to make back its production costs unless you recorded it in your bedroom, fans assumed we’d forever have to settle for hit ‘n’ run collaborations from Elvis. This is a bloody miracle!

Posted in music

music: Corduroy transcends Acid Jazz

I somehow found out about Acid Jazz in 1992 entirely through a few compilation CDs.

One of them (100% Acid Jazz) contained an effervescent funky pop song called “Mini” by Corduroy, about… the Mini car.

I like the way your eyes light up in the dark, I like the way it don’t take much for you to start. Mini! No one takes me further than you

(The song cries out for a music video, and there was one, but sadly the Internet only has a partial poor quality capture.)

Who is Corduroy? What else did they record? At the time the internet couldn’t tell me, and they had no albums in my local record stores.

Friends’ adventures with their Mini reminded me of the song, and now of course obscure bands aren’t so obscure. It turns out Corduroy is a strong English band with an intriguing mix of retro soundtrack style and acid jazz chops. “Mini” is from the 1994 album “Out of Here” on the Acid Jazz label (talk about pigeonholing!). Other tracks are also strong, but some are so tight they’re airless. The title song is a gentle getaway song with a perfect piano solo from Scott Addison. “The End of the Rainbow” has a killer intro they smartly repeat for the bridge, then lays out into a weak funky jam. “January Woman” borrows the opening chords and guitar sound of Steely Dan’s “Green Earrings” off The Royal Scam to fine effect.

The next album, 1997’s “The New You!”, is even better: less constrained by tight funkiness and with better songs. Most songs start with an atmospheric intro then head in a different musical direction. Their lyrics still aren’t the best, but the songs have allusive phrases – “Season of the rich”, “This is supercrime and it happens all the time” (a song about trying to get a refund for a broken hi-fi!), “The hand that rocks the cradle rules the world,” “Tomorrow you will be a designosaur,” “Be an evolver!” etc. The instrumentals are good too: “Data 70” sounds like a 1970 caper movie soundtrack, ending with the same explosive riff as the Mission Impossible theme song, all bongos and horns wailing. In particular “Fisherman’s Wharf” is fantastic, a loving homage to Mike Post’s cop theme songs “Hill Street Blues” and “The Rockford Files.” It has police radio chat and sirens to start, seagulls near the beach, and even a faux NBC logo sound at the end. But it’s not kitsch, it’s just great.

The band returned after a 17-year break with the wonderful title “Return of the Fabric Four”, more of the same but not as magic.

So the one hit wasn’t just a flash in the pan. This makes me want to redouble my efforts to track down other B-sides and mystery artists, like Cooly’s Hot Box from another acid jazz compilation, Giant Steps Volume One, and the legendary Radio Arabesque.

Posted in music

Minox James Bond off-hours camera

Minox 35GL and case

A mini-classic

Minox was moderately famous for making a tiny spy camera using 8x11mm film. But they also made the smallest 35mm camera in the world (100 x 61 x 31mm or just 2.4″ tall), and I owned several as each broke or Minox made small improvements.

The original flash wasn’t too bad, but it ate through batteries. The second-generation flash took four at a time and is bigger than the camera! So to carry the pair with me I used a purse (!). LEDs are an incredible breakthrough in so many ways.

Minox 35 GL and various flashes

So L.A. with all the flash

Posted in design

Switched from Monkeybrains to Dreamhost

Meet the new host, much like the old host.

Alex and Rudy at Monkeybrains are great folk; I was happy to give them money to handle e-mail and host this site and some others (but not one, sigh, that was lost to domain squatters). Monkeybrains is clearly less interested these days in being an ISP to small fry. Years ago they told me I would be better off paying them for a small VPS (virtual private server, basically your own computer in the cloud), but when it works shared hosting saves you some admin hassles.

sftp (secure file transfer) broke, at which point I gave up. I was already administering my WordPress site in the clear without https, but transferring files without a password was just too insecure.

Anyway, Dreamhost shared unlimited hosting seems a good deal, so I signed up. Their in-house control panel is a bit funky but I can figure it out. It took a while to make the transition. Here are the cleaned-up steps, I wasn’t this organized.

  1. wget our sites to pull down all the web content that is reachable starting at the top (i.e. what Google does when it crawls, or “spiders,” a web site).
  2. Delete all the retrieved “files” that weren’t static content, such as WordPress blog posts, generated RSS feeds, directory listings, etc.
  3. sftp the static images that WordPress manages from /wordpress/wp-content/uploads/, plus a few other files that were on the site without being linked to.
  4. This still left hundreds of files that were on the old web sites that aren’t on the new host – e-mail me if you’ve lost access to some beloved item.
  5. sftp all that static content (I used KDE’s fine Krusader split-pane file explorer) up to Dreamhost.
  6. Admire it in temporary mirror sites.
  7. Move all our e-mail off old IMAP server to local folder (pretty good instructions).
  8. Change our domain’s DNS records to point to Dreamhost. At this point e-mail to our domains and requests for our web sites flowed to Dreamhost.
  9. Enable https using free Let’s Encrypt certificates (yay!).
  10. Re-add new/old e-mail accounts now at Dreamhost.
  11. Futz around with /etc/hosts so I could access both old WordPress site and new one simultaneously.
  12. Run WordPress export from old site.
  13. Realize that Dreamhost puts all the WordPress admin files in the root of your web site, which isn’t ideal and doesn’t match my old site, so move all those files into a /wordpress directory (instructions).
  14. Tweak index.php and .htaccess to reflect the WordPress reorganization.
  15. Recreate the skierpage user on new WordPress site, install a plug-in and the Twenty Ten theme I was using on the old site.
  16. Import the XML file of the old site.
  17. Write this!
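
Step 2 above (weeding generated pages out of the wget mirror) is the part I would script next time. Here is a first-pass Python sketch; the markers and suffix list are my guesses at what a crawl of a WordPress site leaves behind, not something wget or WordPress documents, so inspect your own mirror and adjust before deleting anything.

```python
from pathlib import PurePosixPath

# Hypothetical signs that a mirrored file was generated rather than static.
GENERATED_MARKERS = ("?", "/feed/", "/wordpress/20")
# File types worth carrying over as-is (an illustrative list).
STATIC_SUFFIXES = {".html", ".htm", ".jpg", ".jpeg", ".png", ".gif",
                   ".pdf", ".css", ".js", ".txt"}


def looks_static(path, extra_markers=()):
    """Rough guess: keep files with a static suffix and no generated marker."""
    if any(marker in path for marker in GENERATED_MARKERS + tuple(extra_markers)):
        return False
    return PurePosixPath(path).suffix.lower() in STATIC_SUFFIXES


# Made-up paths standing in for a mirror listing.
mirror = [
    "site/images/shirt.jpg",
    "site/index.html?p=123",
    "site/feed/index.html",
    "site/wordpress/2018/10/some-post/index.html",
    "site/notes.TXT",
]
for path in mirror:
    print("keep  " if looks_static(path) else "review", path)
```

Treat the output as a to-do list for a manual look, not as a delete command.
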
Posted in web

music: compare and contrast

The most excellent Brandon Harris asked his followers for musical “must pick one” binary choices. My thoughts (I refuse to pick!):

Johnny Cash or Elvis Presley, Michael Jackson or Prince

Both choices are earthshaking performer vs. musical creator.

MJ delivered two faultless albums and many great songs, but he never made anything so unique as e.g. Anotherloverholeinyohead.

Beatles or Rolling Stones (or anyone)

If you listen to the Beatles’ albums in sequence, around Rubber Soul they unfold wings and soar higher… and keep doing it for five albums. I think musical groups throughout the seventies struggled with “okay they’re on Mount Olympus, what the f*** do we do down here?” Rock even harder, go back to roots, go further with progressive rock, import the jazz they never did to make jazz-rock or jazz-funk, and explore performance they abandoned with glam rock, UFO stage sets, etc.

Duran Duran or Adam and the Ants

Duran Duran made more good songs, but Adam and the Ants were hugely influential in Europe. Duran Duran wanted everyone to become a fan, while Adam and the Ants explicitly made hermetic music for its tribe (“that music’s lost its taste, so try another flavor – Ant music!”, “A new royal family, a wild nobility We are the family … Antpeople are the warriors, Antmusic is the banner”). It encouraged bands to be more niche, more extreme, and many genres bloomed: new romantics, electropop, deep jazz, Northern soul (Dexy’s Midnight Runners), ska …

Posted in music

disintermediation: The end game for retail, and how you can profit

It doesn’t make any sense for 50 merchants to be selling the (allegedly) same battery or memory card through Amazon. Most ship through Amazon or even commingle items, so the robot pulls your order from one shelf filled with all those merchants’ supply of the same item! The only way a seller can be much cheaper than everyone else is if it sells stolen, refurbished, or fake items.

Two things will happen.

  1. Manufacturers will sell direct through Amazon, getting rid of useless middlemen that add no value and only give them a bad reputation.
  2. Amazon Basics will expand. Last time I bought rechargeable batteries I didn’t waste an hour ignoring the 5-star reviews to read all the horror stories of dead fake repackaged batteries, I just ordered Amazon Basics and got working batteries in frustration-free packaging.

Also, there are thousands of Chinese manufacturers making good products desperate to make a name for themselves and escape the nightmare of contract manufacturing for once-mighty brands that are squeezing them to cut corners. Folks, learn Chinese and help Happy Dongfeng Best Factory market its products. “Here’s why our $6 USB charger is better than the others.”

I expressed these ideas a decade ago in my perspicacious “disintermediation” series of posts (ordering direct, Amazon should be/buy UPS, universal spiff). Yet Jeff Bezos still hasn’t contacted me to discuss them over a power breakfast…

Posted in web