science: cocaine-fueled Hollywood-quantum matchup?

Don’t you love those Hollywood-excess parties, where Spiros Michalakis (research professor and manager of outreach at Caltech) is doing cocaine with a bunch of industry heavyweights and remarks “I have a lot of grant money from the National Science Foundation left over due to an accounting error, let’s blow it on a big-budget short film to promote awareness of some of the more speculative aspects of quantum mechanical theory… hell let’s make TWO short films and a ‘making of’ featurette! Quantum babyyyy!! <snort> Ahhhhh”

Today I learned that 3 years ago this actually happened, starring Stephen Hawking, Paul Rudd, Zoe Saldana, Keanu Reeves, Alex Winter, … 🙃

Posted in movies, science | Leave a comment

software: NeWS as James Gosling’s best weird idea

Lex Fridman talked to James Gosling, famous for the Emacs editor and the Java language.

At 1:47:40 he says “I’ve got this weird history of doing weird stuff.” I was fortunate to be writing documentation at Sun Microsystems in the Programming Environments team when he came up with one of the best “weird ideas”: NeWS, the Network/extensible Window System. It used the PostScript language from printers enhanced with object-oriented programming, not just to draw things on your screen, but to exchange and invoke code between your program and the window system (which might be running on another computer across the network).  So instead of calling a fixed triangle drawing function to “draw two long skinny triangles with these points”, a clock program could send the definition of a drawClockHands operator to the window system, and then just send 10 42 drawClockHands to make the window system show the time at 10:42. And you could redefine drawClockHands to draw Mickey Mouse hands, or LED segments, or whatever.
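For the programmers in the audience, here is a toy sketch of that idea in Python rather than PostScript. Everything in it (the ToyWindowServer class, its define and invoke methods, the drawing strings) is made up for illustration; real NeWS sent PostScript source over the wire, not Python functions. The point is only that the client installs an operator once and afterwards sends tiny messages.

```python
# Toy model of the NeWS idea: ship code to the window server once,
# then send short invocations. All names here are hypothetical.

class ToyWindowServer:
    def __init__(self):
        self.operators = {}   # operator name -> code the client sent over
        self.display = []     # stand-in for the screen: a log of drawing commands

    def define(self, name, func):
        """Install (or redefine) an operator on the server side."""
        self.operators[name] = func

    def invoke(self, name, *args):
        """Run a previously defined operator with the given arguments."""
        self.display.extend(self.operators[name](*args))


def draw_clock_hands(hour, minute):
    # Angles in degrees, clockwise from 12 o'clock.
    minute_angle = minute * 6
    hour_angle = (hour % 12) * 30 + minute * 0.5
    return [f"triangle hour-hand at {hour_angle:g} deg",
            f"triangle minute-hand at {minute_angle:g} deg"]


def draw_led_segments(hour, minute):
    return [f"LED digits {hour:02d}:{minute:02d}"]


server = ToyWindowServer()
server.define("drawClockHands", draw_clock_hands)
server.invoke("drawClockHands", 10, 42)             # the short per-minute message
server.define("drawClockHands", draw_led_segments)  # swap the look, same message
server.invoke("drawClockHands", 10, 42)
print(server.display)
```

The clock program's message stays the same in both cases; only the definition living in the server changed.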

NeWS was an incredible conglomeration of networking, rendering, and language ideas; phenomenal stuff in a world that was only just adopting network programming and OOP, and where program windows with rounded corners only existed on graphics supercomputers. Sun offered it to the other workstation companies, but they didn’t want Sun to control the window system as well as the file system with its NFS [*], so they cast around for an alternative and settled on the far more basic X11 window system.

[*] Sun’s Network File System became a standard on the level of FTP between networked computers, but it didn’t successfully jump onto PCs when they got networked. It was overtaken by Netware which was then destroyed by Microsoft’s Windows for Workgroups.

Posted in software | Leave a comment

software: video fixing is possible, but not easy

In your phone’s Google search bar, search for certain animal names, then tap View in 3D, then tap View in your Space. You can even take a video as you move around (and it occasionally lashes its tail). VR is so passé, AR (Augmented Reality) is kewl.

Then spend 10 minutes trying unsuccessfully to turn off the permissions you had to give Google Search to access your camera and microphone (“Let’s film you while searching and analyze your facial expression to see how frustrated you are with Google”, what’s the harm?), and remember to curse Google for discontinuing gems like Chromecast Audio and Google Play Music while screwing around with stuff like this.

Then try to clean up and share the video, and the real fun starts…

Video processing by random walk (ultra-nerd alert!)

My camera was confused filming downward, so the original video had the wrong orientation. You can realign all the pixels to the correct orientation, but it’s even simpler: just change the video’s metadata to indicate that it should be displayed rotated. Linux media processing tools such as VLC and ffmpeg have accrued literally hundreds of options to modify video and audio streams, and I found an incantation to change the metadata:

% ffmpeg -i original_video.mp4 \
        -c copy -metadata:s:v:0 rotate=90 \
    alligator_and_dogs.mp4

Next problem: Android told my phone’s camera to take a 1920×1080 video. Most phones do not have a sensor with exactly this 16:9 ratio, so normally when told to capture at a particular size they sample a part of what their sensor captures. Somehow my phone + Google’s software did this wrong, and the video wound up with black bars on the top and bottom. Ffmpeg has a video filter, cropdetect, that detects black bars and outputs a cropping rectangle, but the transition from video to black left a single line of glitched pixels at the bottom of each video frame. I could have probably fiddled with cropdetect’s parameters to get the right output; instead I took a snapshot in VLC (press [Shift+S]), zoomed into it in a paint program, and found the top bar is 22 pixels tall and the bottom 39 pixels.

Ffmpeg has a crop filter that lets you specify how to crop the input video. But figuring out the format for it was hard. All the guides I read gave a series of ever more outlandish cropping recipes, e.g. crop=in_w/2:in_h/2:in_w/2:in_h/2 ; none of them explained that this specifies an output width and output height, then a starting position (x, then y) in the original frame. Once I knew that, I worked out that I needed to crop to the input video’s width (in_w), 61 pixels less than the input height, starting 0 pixels over, and 22 pixels down: crop=in_w:in_h-61:0:22. Clear as mud!
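If you'd rather not do that in your head, here is a small Python sketch that builds the crop string from the measured bar sizes. The helper name crop_filter and its parameters are my own invention; only the crop=out_w:out_h:x:y layout and the in_w / in_h variables come from ffmpeg.

```python
def crop_filter(top_bar, bottom_bar, left_bar=0, right_bar=0):
    """Build an ffmpeg crop filter string of the form crop=out_w:out_h:x:y."""
    def shrink(dimension, amount):
        # Leave the dimension untouched when there is nothing to trim.
        return dimension if amount == 0 else f"{dimension}-{amount}"

    out_w = shrink("in_w", left_bar + right_bar)
    out_h = shrink("in_h", top_bar + bottom_bar)
    # x and y are where the kept rectangle starts in the original frame.
    return f"crop={out_w}:{out_h}:{left_bar}:{top_bar}"


print(crop_filter(top_bar=22, bottom_bar=39))
```

For my 22-pixel and 39-pixel bars this prints crop=in_w:in_h-61:0:22, the same string as the hand-worked one.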

Facebook wouldn’t let me upload this MP4 video, because it was too brief. No problem, convert it into a GIF. I also wanted to reduce the file size. VLC’s Tools > Media information > Codec said the original MP4 video’s frame rate was 48.408636, so reduce the frame rate to about 1/3 of this, 16 fps. Also halve the video resolution with ffmpeg’s scale video filter to (1080 – 22 – 39)/2 ≈ 510 tall (and -1 wide as a magic value to preserve the aspect ratio).
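The arithmetic, spelled out as a few lines of Python in case you want to plug in your own numbers. The variable names are just mine, and rounding the half-height up is my choice, not anything ffmpeg requires.

```python
source_fps = 48.408636         # frame rate VLC reported for the original video
frame_height = 1080            # height of the frame including the black bars
top_bar, bottom_bar = 22, 39   # bar heights measured from the snapshot

target_fps = int(source_fps / 3)                       # 16.136... truncated to 16
cropped_height = frame_height - top_bar - bottom_bar   # 1019 pixels of real video
half_height = (cropped_height + 1) // 2                # 509.5 rounded up to 510

print(target_fps, cropped_height, half_height)
```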

Put it all together and the command to make a cleaned-up small animated GIF out of the video is:

% ffmpeg -i alligator_and_dogs.mp4 \
        -r 16 \
        -vf "crop=in_w:in_h-61:0:22,scale=-1:510" \
    alligator_and_dogs.gif

I didn’t actually check if this made the right adjustments but it looked OK, so ship it. I should fiddle around with ffmpeg’s palettegen options to improve the GIF quality, but this took so much time the alligator ate my dog! 🐊🍴🐕

Posted in software | Leave a comment

music: the best video about music

There are lots of music exploration and music theory videos on YouTube. “The 7 Levels of Jazz Harmony” by Adam Neely is pure joy. Even if you don’t know your E♭maj7 from a hole in the ground, even if you hate jazz, the way he builds on a simple short pop phrase is musical, funny, inspiring, weird. I’ve watched it 7 times, and Lizzo’s “Juice” and the Scoville scale for hot peppers have permanently fused in my brain.

I joined Patreon just to reward Adam Neely for this achievement.
🎼👂🧠❤️ 😍!

If you like this sort of thing, Rick Beato has a great series “What Makes this Song Great” where he breaks down great rock and pop songs track-by-track and moment-by-moment to identify the elements of the composition, production, and performance that make it great. If for example you’ve ever wondered why “Every Little Thing She Does is Magic” by the Police is so appealing despite some cheesy elements, his breakdown is gold.

Posted in music | Leave a comment

skiing: rockers forever

I can tell I’m missing skiing in my bones when I start rolling my ankles on edge when standing still. By mid-summer I miss everything about it, even the frozen fingers, the end-of-day ache overcome for just one more run, … So I’ll relive skiing with delayed blogging about it.

I’m no longer a part-time resident of the-ski-area-soon-to-formerly-be-known-as Squaw Valley USA, instead taking trains to various ski areas. So we no longer own skis; instead we rent performance skis at the resort. In theory this lets me do massive ski evaluations, swapping skis throughout the day to find the perfect ski, as I did when I found my front-side skis.

Let’s rocker

tip of a rocker-style ski
Clown shoe tip!

I knew I wanted to try “rocker” skis. You want a long fat ski with a lot of area to lift you out of deep snow, but a long ski is less maneuverable in bumps and a fat ski is less willing to go on edge and carve. So, just curve the tip and tail up, so that on packed snow they’re flapping in the breeze and you’re effectively riding a shorter ski. The immortal Shane McConkey came up with Volant Spatulas that had reverse camber (so the center of the ski touches the snow), more like a waterski, and then improved the design with K2 Pontoons. Rossignol I think was one of the first to combine the usual camber underfoot (so the center of the ski is off the snow until you weight it) with tips curved up and out of the snow and an odd sidecut. The term of art for this is “rocker.”

The playful Rossignol Soul 7 HD

Several friends swore by the Rossignol S7 when it first came out, and Rossignol has been refining the design for over a decade into the Sky 7, Soul 7, Super 7, Soul 7 HD, … so Rossignol was the first ski I rented a few seasons ago. Even the Soul 7 has been through multiple iterations:

Three generations of the Rossignol Soul 7 (HD)
Evolution of the weird sidecut. But it works! (from the great BLISTER review)

The Soul 7 HD is fantastic. It’s playful, so willing to make different turn shapes. It’s fat underfoot at 106 mm, yet will still carve if you push it. So rather than endlessly swapping skis looking for perfection, my default is to just rent these in a 180 cm and be done.

Head Kore

2018 Head Kore 93 top view

In 2019 the Head Kore series got favorable reviews and won awards, so I specifically tried to rent it. It’s also excellent. It feels more damped and stable than the Soul 7 HD, even though it’s actually lighter (less than 2kg a ski which is really light), and a little faster. In most ways it’s a better ski than the Soul 7 HD, but somehow not as inspiring.

Völkl Mantra 102

Völkl Mantra 102 details
from skiessentials.com

I traveled to Zermatt in Europe, and weirdly the Head Kore was unavailable; all the skiers were bombing down the pistes on skinny short race skis. I tried some skis I wasn’t happy with, then settled on the Völkl Mantra 102. I was dubious since my impression of Völkl’s fat skis was they’re beefy planks for charging Western USA all-mountain skiers: just get them out to the side on edge and power through big turns. I’m simply and sadly not that strong. But the Mantra has morphed into a rockered ski, and it’s pretty great: definitely faster, better edge hold, still decently maneuverable.

Posted in skiing | 3 Comments

design: a Black Lives Matter poster

It is the least I could do; downloading this PDF and printing out the first three pages is the least you can do.

I made it in the free and open source LibreOffice program, using the fonts Cantarell Extra Bold (originally designed by Dave Crossland) and Dobkin Script (by Dieter Steffmann); they are free to install for such personal use. I filled in the ‘v’ in “Lives” using the free and open source Inkscape program.

2021-11 update: I put it online, why not?

Why hide my work on the 3-millionth most popular blog on the web where no one will see it? Since I’m giving it away I might as well store and version the files publicly, so I hid it among 200 million other repositories on GitHub, at https://github.com/skierpage/BlackLivesMatter_poster. I even filed a bug (called “issue” on GitHub) that the T-E-R in “MATTER” is murky and hard to read when printed out.

Open source code art

Before doing this I searched for “Black Lives Matter” and “BLM” on GitHub to see how others did it. I found one project with a few enormous photos of protests, another mysterious piece of code with no explanation of what it does, and a promising-sounding but completely empty sk88888888ordie/BLM-Posters project. I assume artists mostly share digital works on other sites such as DeviantArt (I always found that name sketchy). Artwork, even digital, isn’t really like code, so GitHub doesn’t make it easy to indicate “This project is available under a Creative Commons Attribution-ShareAlike license.”

Git Yer Crypto Art!

The NFT (non-fungible token, with extra-sparkly blockchain goodness) for this artwork is available for one meeelyun dolars. You can own a nearly meaningless unit of data that proves… that you spent a lot of money for something that might be tangentially related to a freely available digital file. Operators are standing by!

Posted in design, open source | Leave a comment

Google Play Music: 18 million songs and no respect

I signed up for Google Play Music All Access (Google marketing managers are incompetent at naming) the week it was announced, back in the good old days when Google’s motto was “Don’t be evil” and every month they brought exciting advances in the power of the web. For the $7.99 introductory offer you could listen to 18 million songs! Access to nearly every song changes a music fan’s life; hear something you like, identify it with Shazam, then dig as deep as you care. When Google introduced its cute Chromecast Audio puck and I could play all those songs in pretty high quality on audio equipment, the experience got even better.

When Google repeatedly extended YouTube with Red/Plus/Music/blahblah alternatives, I mostly ignored its half-assed attempts to turn music listening into random video playlist watching, but I got the premium version for free with the fantastic benefit of no YouTube commercials ever! All in all, GPMAA is the greatest $107.88 a year I spend.

But 18 million done badly is not everything

Except… it isn’t access to everything. I knew Prince aka The Artist Formerly Known as Prince had a love/hate relationship with digital music and streaming, so I expected his catalog might be less available, along with other streaming holdouts like Bob Seger. But the random undocumented omissions in Google Play Music All Access are intermittently infuriating.

example: Unforgettable, but album amnesia

The first time I realized how bad it is was when I was looking for Nat King Cole and found most of his albums unavailable, then tried searching for his time-travelling duet with daughter Natalie. Her album Unforgettable… with Love is available, but not the eponymous track where she duets with Dad! Fine, whatever dispute Google has over Nat King Cole’s catalog extends to this duet. But the song simply doesn’t appear in Google Play Music’s track listing for the album! Don’t f***ing lie to me about which tracks are on an album!

Here’s another example, the immortal Blues Brothers Original Soundtrack Recording. According to GPM, these 7 tracks are the entire record. There’s a hint of the problem with missing track 6 (the gospel choir singing “The Old Landmark”), but all the songs from the ending concert are gone! No Cab Calloway singing “Minnie the Moocher,” no “Sweet Home Chicago,” no “Jailhouse Rock.” It’s an 11-track album. What the hell?!

This is not a track listing of the album!

example: Andy Summers creativity castration

After listening all the way through the Police’s oeuvre (four exceptionally good albums, one short of the 5-album cutoff for eligibility for “immortal run” status), I wanted to continue with their solo careers, starting with guitarist Andy Summers (a better Edge than the Edge). I remember reading a favorable review of his album titled The Golden Wire or something, but at the time I never heard it on the radio and wasn’t about to buy it unheard (kids of today, we had it so hard before the Internet). So go to Google Play Music, search for Andy Summers, view All albums, … no indication of such an album. Read his Wikipedia article, there it is in 1989. It’s not obscure, it’s a central part of his artistic output. Don’t f***ing lie to me with a list of All albums of an artist that isn’t all albums!

It is awful that Google Play Music silently omits the songs and albums that it doesn’t have rights to sell or stream. “Our company mission is to organize the world’s information and make it universally accessible and useful.” So do it, you lazy f***ers!

(from How Google Search Works | Our Mission)

Similarly, Andy Summers’ collaboration I Advance Masked with Robert Fripp on A&M is unavailable and unmentioned. If I know the album title and search for it, GPM shows links to YouTube videos that are probably illegal uploads by well-meaning fans, but I want to know that they collaborated and released an album. GPM’s presentation of music information is insultingly incomplete.

But no respect

When I search for a song by an artist, I expect the first result to be the song from the album on which it was released. That’s where it all began, that’s what I care about, that’s where Google provides some useful information (often it’s the opening section of the album’s Wikipedia article). Instead GPM will randomly show me the song on garbage “Best of the NNN0s” compilations, movie soundtracks, sad live bootlegs, all the artist’s greatest hits albums, karaoke versions, and cover bands. Everything but the original album! I wind up having to search Wikipedia or Discogs to find the album title, then search for that, then click the album, then find the song.

not one of these is a studio album by The Spinners! (and the original album is not in the “95 more”!)

Metadata wrong all over

Mayer Hawthorne 'Man About Town' album in GPM with conflicting year of release
Wikipedia editors know it’s a 2016 album, Google Play Music is confused.

Google frequently has the date of releases wrong. Supposedly it gets this info from the record companies, so it’s not Google’s fault, but music web sites get this info right. Google is happy to reuse Wikipedia content about artists and albums, but it can’t be bothered to have deeper integration with sites that know more about albums.

“OK Google, what’s a botched remastering?”

Google Play Music doesn’t even pretend to care about different remasterings of albums. When you find an album, Google’s preference is to show the latest remaster it can lay its hands on, despite the disaster of the loudness war: albums remastered and remixed to sound punchier on the radio.

When there are multiple versions of an album, GPM’s presentation is poor. Often it will present two or more identical thumbnails of an album including the deluxe version or the 25th anniversary re-release, but you can’t tell which is which without visiting each album in turn. Sometimes two albums are indistinguishable.

Google Play Music is dying anyway

I’ve been meaning to moan about Google Play Music All Access misfeatures for years. I’m finally doing so as Google announces it’s killing the product. Already you can’t buy digital songs on it any more. Google will force everyone to YouTube Music, and the lamentations are disheartening. Unlike some subscribers, I think I have local copies of all the digital music files I uploaded to GPM, mostly in the 2000s when I would buy “singles” on GPM and Amazon, and artists’ web sites would offer MP3 downloads of obscure tracks. But why put up with Google’s shenanigans if there are better alternatives? Now would be a perfect opportunity to jump ship to a better music streaming service that respects musical artistry and I hope pays more than a pittance for each song I listen to. Qobuz is an obscure music streaming service that offers higher-resolution tracks (more important for better mixing than actual increased fidelity that you can hear), and it integrates with Roon’s music playing software (another darn blog article I should write). However, it will hurt to give up ad-free YouTube video watching. Even more monthly subscription fees are in my future…

Posted in music, software | Leave a comment

music: wondering at Stevie Wonder

Play Stevie Wonder’s immortal run of albums – Music of My Mind, Talking Book, Innervisions, Fulfillingness’ First Finale, Songs in the Key of Life – and you will be repeatedly floored by his artistry and talent. What brings tears of joy are the elements I’d forgotten amongst the greats: the perfect rainy-day funk of “Tuesday Heartbreak,” Stevie murmuring “Do it, Jeff [Beck]” during the guitar solo on “Lookin’ for Another Pure Love,” the burning vocals in “It Ain’t No Use,” the zOMG what did he just do chord changes in the B melody of “Please Don’t Go,” the Nokia ringtone teleported into “All Day Sucker,” …

This Slate article is emphatic: “arguably the greatest sustained run of creativity in the history of popular music.” Is it “greater” than Joni Mitchell’s run, or Elvis Costello’s first five albums, or the Beatles’ lighting the rocket engines around the release of Rubber Soul? The obvious answer is they’re incomparable in both senses of the word.

But I’ll give it a go. Stevie Wonder’s lyrics can’t compete with Joni or Elvis, they’re at best direct expressions of emotions but often convoluted without strong wordplay. Co-producers Robert Margouleff and Malcolm Cecil on the first four are deservedly famous for advancing synthesizers with their T.O.N.T.O. system and use of synthesizers for bass, strings, harmonies – everything but drums.

Rhythm, not drums

Stevie Wonder is obviously outrageously talented on keyboards, harmonica, and singing. It’s easy to overlook his drumming; he’s not deeply in the pocket, or super-heavy, or flashy. He can ride the hi-hat like a disco drummer, but his drumming doesn’t propel the song, it’s another rhythmic element subservient to musical ideas. Stevie gets to play drums and Moog bass and percussive keyboards, so no one instrument has to drive.

Songs in the Key of money

cover of Stevie Wonder's "Songs in the Key of Life"
magnum opus

I bought Innervisions and Fulfillingness’ First Finale when they came out. Re-listening, I forgot how bleak Innervisions is; Stevie Wonder moved away from love songs and heartache songs to look around, and he was distressed by what he saw under the presidency of Richard Nixon.

When Songs in the Key of Life came out as a double album with at first an additional bonus 7-inch EP, I balked. $13.98 was a lot of money! Also some low-talent British singer re-made “Isn’t She Lovely” as his own mawkish single when Stevie Wonder was unwilling to shorten the song, and BBC Radio 1 stupidly played this over and over instead of the far superior original album track. Over time I grew familiar with the towering songs, including “As” and “If it’s Magic” because friends had the double album. Listening to it on a streaming service, the additional tracks from the bonus single are a revelation. “All Day Sucker” is unlike anything Stevie Wonder did, and “Saturn” is trippy. And the amount of time and care lavished on the record is incredible:

Nonstop sessions stretched across two-and-a-half years, two coasts, and four studios: Crystal Sound in Hollywood, New York City’s Hit Factory, and the Record Plant outposts in Los Angeles and Sausalito. More often than not, he could be found in one of those spaces, sometimes for 48 hours at a time, chasing his muse with a rotating crew of engineers and support musicians. Over 130 people were involved in the recording, including Herbie Hancock, George Benson, “Sneaky Pete” Kleinow and Minnie Riperton. “If my flow is goin’, I keep on until I peak” became Wonder’s mantra.

Inside Stevie Wonder’s Epic ‘Songs in the Key of Life’

Although there are exceptionally talented songwriters and musicians today, same as it ever was… we shall never see its like again.

Posted in music | Leave a comment

music: Trevor Horn and the Buggles in 1979

I revere producers as much as musicians and songwriters. I was dimly aware of producers, starting with the mysterious “produced by Bones Howe” in big letters on the back of some record… I thought it was the Carpenters but now I can’t find it. What really piqued my interest was Chic’s in-your-face credit on most of their early albums:

Composed, produced, arranged, conducted, and performed by Nile Rodgers and Bernard Edwards for the CHIC Organization, Ltd.

which led me to follow all the records that Nile and ‘Nard produced. It’s a joy to revisit a classic song, check the credits, and realize “Wait, that’s yet another great song produced by…” such as unheralded Alan Tarney (who liked his own song “Once in a While” so much he produced it on three different records) or the almighty Arif Mardin. Then you can lose yourself in Wikipedia and Discogs finding all their production credits.

And so to Trevor Horn, the bass player, singer, video (killed the radio) star, and maximalist producer. As a producer he’s probably most famous for his work with Frankie Goes to Hollywood (1984 strikes again!), and my favorite, his spectacular production for ABC’s The Lexicon of Love (when producer full of ideas meets hungry band really going for it, and the magic happens). The guitars on the latter sound 10 feet tall on a great stereo.

But there’s a lot of prehistory to Trevor Horn. Listening to the deluxe reissue of the legendary Dusty in Memphis by the great British pop-soul singer Dusty Springfield reminded me of her unexpectedly solid dance single that I bought on vinyl years ago… turns out it was written by Trevor Horn, his partner in Buggles Geoff Downes, and Bruce Woolley.

1979 was certainly Buggles’ annus mirabilis. They also made the “Caribbean Air Control” single as Big A, made the impressively weird space disco-pop “Star to Star” album as Chromium, then released “Video Killed the Radio Star” first by Bruce Woolley & the Camera Club and then it broke big as Buggles. Then only 10 days into 1980 Buggles released “The Age of Plastic” album. That’s quite something, it’s about as productive as Nile Rodgers & Bernard Edwards of Chic were at the same time.

Posted in music | Leave a comment

computers generating art

First came text

GPT-2 and other AI (Artificial Intelligence) Transformer models are impressive. After you hand the AI an enormous corpus of text, you give it some text and it continues, generating one token (a word or word fragment) after another to make up recognizable, even creative, language that maintains the style. Here’s how https://talktotransformer.com/, an online GPT-2 system, continued when I fed it the words in bold:

“The AI ruined my life” muttered Denise.

“You are the target of someone else’s revenge” replied Cale.

“And you did nothing to prevent it”

“Your words…”

“Yes, that is what I say” smiled Cale.

“So the boys are more than just a valuable asset”

“Yeah”

“But I have things to tell you guys now”

“Huh?”

Cale looked at it. “If we were to adopt our normal lives, we wouldn’t be able to afford everything we want in this world. So for us, we must keep working. It’s this effect of working hard that we were able to obtain everything, even the rare

It’s not simply regurgitating matching phrases that it has ingested. In compressing down the 40 gigabytes of text it was fed into some kind of internal representation in its neural network of only 1.5 billion parameters, it has noticed certain features of what you give it and they trigger… something… in its neural network that riffs on what you provided. Give it the same starting text again and it goes off in a different direction.
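Why does the same prompt wander off differently each time? As far as I understand it, the model produces probabilities for what comes next and the system picks one at random according to those odds. Here is a toy Python sketch of just that picking step; the tiny NEXT_WORD table and its probabilities are entirely made up, whereas a real model computes fresh probabilities over tens of thousands of tokens from everything written so far.

```python
import random

# Hypothetical next-word probabilities, keyed by the previous word only.
NEXT_WORD = {
    "life": {"muttered": 0.6, "sighed": 0.4},
    "muttered": {"Denise": 0.7, "Cale": 0.3},
    "sighed": {"Denise": 0.5, "Bobby": 0.5},
}


def continue_text(prompt, steps, rng):
    """Extend the prompt by sampling one word at a time from NEXT_WORD."""
    words = prompt.split()
    for _ in range(steps):
        choices = NEXT_WORD.get(words[-1])
        if choices is None:   # the toy table has nothing to say after this word
            break
        next_word = rng.choices(list(choices), weights=list(choices.values()))[0]
        words.append(next_word)
    return " ".join(words)


# Different random seeds stand in for clicking "generate" again.
for seed in (1, 2, 3):
    print(continue_text("The AI ruined my life", 2, random.Random(seed)))
```

Run it with the same seed and you get the same continuation; change the seed and the dice can land elsewhere, which is all that "goes off in a different direction" needs.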


“The AI ruined my life” muttered Denise.

“Ha! You’re quite an interesting girl. Like a puppet.”

A sorcerer suddenly appeared next to the table and pointed his wand at Denise, saying “Use your magic!”

A short spell field with dazzling golden lights appeared above Denise’s head and soon a single tablet appeared on the table and began to write. Then the tablet turned around and charged towards the sky with a radiant light.

The tablet charged forwards quickly!

The intensity of the light was frightening. Once it passed the boundary of the spell field, it became all-encompassing. The tablet was so bright that the color seemed to change from a golden to a blue, and not only did it look beautiful,

And again:

“The AI ruined my life” muttered Denise. “I looked over at Bobby, who was sitting with his back against the wall and moving his head side to side. He said nothing. His mouth was hanging open. I then turned back to the door and said, “What the hell is going on?” The door opened and I was met by my father, who had been in hiding for about three weeks and was waiting for me at the door. Bobby was still sitting there, looking over at his father. But in this instance the looks were no longer new. Instead, they had become familiar, like a sick man watching a recently emerged corpse of his loved one. His face turned beet red and he said, “You didn’t make my greatest

Transformer models don’t just continue writing in almost any style. With additional fine-tuning they can recognize a question and answer format, or a simple math problem, or a multiple choice quiz, or a request to summarize, … and continue with the answer to the problem better than most humans. And the newest GPT-3 (eleventy billion parameters in the model! fed a trillion words! gargantuan PDF paper!) can do all these without any fine-tuning! It’s ingested so much text that if you give it one or a few examples of what you want it will figure out what you’re asking for, just as a kid can participate in a made-up game without having to go to classes in that game. The following interaction, getting it to use a made-up word, is amazing to me:

To do a “farduddle” means to jump up and down really fast. An example of a sentence that uses the word farduddle is:
One day when I was playing tag with my little sister, she got really excited and she started doing these crazy farduddles.

It’s “merely” responding to input, but be honest, that’s all you’re doing when someone asks “How are you?” or “What day is it?”

It’s been my hope for decades (my thoughts in 2006, 2010) that some AI would gain enough smarts to understand language, then overnight it would ingest every document on the Internet and be the smartest thing in the world. Instead, AI researchers force-feed a huge subset of the Internet into a language model and it “does language” extremely well without understanding what it’s doing or what it all means.

OK so music…

You can apply a similar approach to music. Train a transformer on the musical note instructions in MIDI files, and then give it some starting parameters, and it can generate further musical instructions. OpenAI built such a system, called MuseNet. Here is what transpired when world-unfamous producer skierpage told MuseNet to improvise in the style of Disney starting from Beethoven’s Für Elise. The piano continues well enough, but then the meth-addled drummer comes in from another planet and goes nuts, and then it ends with a piano flourish. I can’t imagine a human being coming up with this.


OpenAI has now moved on to generating actual waveforms of music with its new system, called Jukebox. I think the main motivation is it can generate someone singing lyrics as well as the instrumental performances. This is crazy. It learns to compress digital music files at 44,100 samples a second down to a much smaller compressed representation that only it understands, and then if you ask for music in some style it creates music in that compressed representation and “blows it back up” into millions of samples making up a musical waveform.
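To get a feel for the scale, here is the back-of-envelope arithmetic in Python. I'm assuming the CD-standard 44,100 samples per second, a four-minute song, and a compression factor of 128 picked purely for illustration; check OpenAI's write-up for the real figures.

```python
SAMPLE_RATE = 44_100      # CD-quality samples per second (assumed)
song_seconds = 4 * 60     # a hypothetical four-minute song
compression = 128         # illustrative factor, not a quoted Jukebox figure

raw_samples = SAMPLE_RATE * song_seconds         # numbers in the raw waveform
compressed_steps = raw_samples // compression    # steps the model must generate

print(f"{raw_samples:,} samples -> {compressed_steps:,} compressed steps")
```

That is over ten million numbers for one song, which is why generating in the compressed form and expanding afterwards is so much more tractable.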

Here’s “Rock, in the style of Elvis Presley.”

It’s weird, like a broken radio tuning into a performance by a rock and roll garage band in love with Elvis but they only heard his songs on their own broken radio. And the AI has learned that Elvis was frequently interrupted by crowd noises and cheering, so after a while it throws that in too.

The lyrics on this are confusing; OpenAI says “All the lyrics below have been co-written by a language model and OpenAI researchers.” But if you want crazy lyrics, someone found Jukebox’s continuations of Rick Astley’s legendary meme song. Jukebox mostly trundles along in that inimitable 80s Stock-Aitken-Waterman style, sometimes adding some novel production ideas or a keyboard solo just like the original producers would mete out new ideas while sticking to the format. But its muffled lyrics include at 1:37 “you wouldn’t get this spaghetti on a guy… Stretch my 🍆” 😂. Later Rickbot goes bleak: 1:53 “Kiss the boat Denny I’m Satan’s pirate arrr”, and 3:53 “You know the rules and so you have to die.”

https://www.youtube.com/watch?v=iJgNpm8cTE8

To stress the same point as the text generator, the AI isn’t simply pasting in bits of music that it has stored matching the starting music. Instead it is calling on the… vague memories/regularities/something… that it has gleaned from ingesting “1.2 million songs (600,000 of which are in English), paired with the corresponding lyrics and metadata from LyricWiki” to produce something new yet familiar. Go browse, for example they asked it for Frank Sinatra and Ella Fitzgerald in front of a small orchestra.

… and why not images

I’ve tried to write this blog post a few times, only to have OpenAI apply transformer AI to a new area. Just today, OpenAI announced a new paper wherein it gets another transformer AI to complete an image. Same idea: give the AI millions of images, don’t tell it anything, then give it the top half of an image and it will produce one pixel value after another that continue the image. Look at it get the cat joke just from a sliver of paper visible in the input (the left column is its input, the right column is the original complete image, the middle four columns are its continuations).

So what does it mean?

These things are crazily impressive. It is rank speciesism to say “That’s not intelligent! It’s just doing something it’s been programmed… taught… trained… fed so much data it recognizes what it should do,” especially when the format of its output is far beyond human capacity – you’ve been trained for years on Real Life but aren’t able to generate the sound of a band and Elvis Presley, or the pixels of a photorealistic image. These AIs are intelligent! And yet… they can’t maintain the plot or a musical idea over the entirety of a short story or a song. So what is it that we do when we create? Somehow we have an outline for the overall structure of the artwork, and fill in along its lines. I’m no expert, but it seems that creativity may be easier to implement than a general intelligence which can deal in concepts and know what words mean. None of these AIs can talk about their work. We can’t ask “What do you find hard? What do you enjoy? What were you aiming for when you went off on that tangent?” They’re sui generis, but the closest analogy seems to be idiots savants.

Posted in AI, art, music | Leave a comment