{"id":863,"date":"2019-05-26T23:40:11","date_gmt":"2019-05-27T06:40:11","guid":{"rendered":"https:\/\/www.skierpage.com\/blog\/?p=863"},"modified":"2022-08-31T17:46:50","modified_gmt":"2022-09-01T00:46:50","slug":"music-web-sundays-spent-disambiguating","status":"publish","type":"post","link":"https:\/\/www.skierpage.com\/blog\/2019\/05\/music-web-sundays-spent-disambiguating\/","title":{"rendered":"music\/web: Sundays spent disambiguating"},"content":{"rendered":"\n<div class=\"wp-block-media-text alignwide is-stacked-on-mobile\"><figure class=\"wp-block-media-text__media\"><img decoding=\"async\" src=\"\/images\/web\/google_sundays_fake_news.png\" alt=\"\"\/><\/figure><div class=\"wp-block-media-text__content\">\n<p class=\"wp-block-paragraph\">The Google Now screen on my phone does a good job presenting news relevant to me. It struck gold when it displayed &#8220;new album out now&#8221; by The Sundays, zOMG!! After 22 years, out of nowhere they deliver a new album around Harriet Wheeler&#8217;s astounding voice and David Gavurin&#8217;s chiming guitar work!<\/p>\n<\/div><\/div>\n\n\n\n<p class=\"wp-block-paragraph\">Oh noes, it&#8217;s actually an unrelated Japanese band.<\/p>\n\n\n\n<figure class=\"wp-block-image is-resized\"><img loading=\"lazy\" decoding=\"async\" src=\"\/images\/web\/sundays_home_slide02.jpg\" alt=\"SUNDAYS in 2016\" width=\"640\" height=\"312\"\/><figcaption class=\"wp-element-caption\">These fine folk are SUNDAYS\uff08\u30b5\u30f3\u30c7\u30a4\u30ba\uff09<\/figcaption><\/figure>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/www.skierpage.com\/images\/web\/the_sundays.jpg\" alt=\"press pic of The Sundays\"\/><figcaption class=\"wp-element-caption\">but I don&#8217;t think they&#8217;re The Sundays<\/figcaption><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>2nd example<\/strong>: Google Play Music, YouTube, and Genius show Korean songs by \ucf54\ub4c0\ub85c\uc774 (&#8220;corduroy&#8221;) as by the 
English acid jazz group Corduroy <a href=\"\/blog\/2018\/09\/music-corduroy-transcends-acid-jazz\/\">of whom I&#8217;m a big fan<\/a>. Yes her name <em>translates<\/em> as &#8220;corduroy,&#8221; but she&#8217;s a different artist!<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>3rd example<\/strong>: GPM found a sweet cover of Burt Bacharach&#8217;s light adult pop song &#8220;Knowing When to Leave,&#8221; made in 1998 by Casino, which Google and English Wikipedia agree is a rock\/alternative band from Birmingham. Crazy genre-defying work? I finally figured out that the British Casino didn&#8217;t even form until 2003 and the song is actually by an obscure Icelandic band also called &#8220;Casino&#8221; together with P\u00e1ll \u00d3skar Hj\u00e1lmt\u00fdsson. It&#8217;s part of an entire album of sincere\/camp\/tongue-in-cheek recordings of late 1960s\/early 1970s hip music that is not just in stereo, it&#8217;s called &#8220;Stereo.&#8221;<\/p>\n\n\n\n<figure class=\"wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-4-3 wp-has-aspect-ratio\"><div class=\"wp-block-embed__wrapper\">\n<iframe loading=\"lazy\" title=\"Knowing When to Leave\" width=\"640\" height=\"480\" src=\"https:\/\/www.youtube.com\/embed\/ZACPlqCkVfk?feature=oembed\" frameborder=\"0\" allow=\"accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share\" referrerpolicy=\"strict-origin-when-cross-origin\" allowfullscreen><\/iframe>\n<\/div><figcaption class=\"wp-element-caption\">The band &#8220;Casino&#8221; but not THE band &#8220;Casino&#8221;<\/figcaption><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>4th example<\/strong>, then I&#8217;ll stop: Google Now then alerted me to the new album by progressive rock masters Yes, named &#8220;Chet.&#8221; Well, <a href=\"\/blog\/2013\/06\/music-steve-howe-phenomenal-guitar-work\/\">my hero Steve Howe<\/a> is a fan of Chet Atkins, so it&#8217;s 
possible&#8230;<\/p>\n\n\n\n<div class=\"wp-block-media-text alignwide is-stacked-on-mobile\"><figure class=\"wp-block-media-text__media\"><img decoding=\"async\" src=\"\/images\/web\/chet_yes_plis_cover.jpg\" alt=\"\"\/><\/figure><div class=\"wp-block-media-text__content\">\n<p class=\"wp-block-paragraph\">Nope, it&#8217;s obviously a different band. Come on, punctuation matters! Just because the band name has a comma in it is no excuse to get it wrong. I&#8217;m going to release music by &#8220;<strong>The Bea<\/strong>[Unicode ZERO-WIDTH NO-BREAK SPACE]<strong>tles<\/strong>&#8221; to see how many people I can scam \ud83d\ude09<\/p>\n<\/div><\/div>\n\n\n\n<p class=\"wp-block-paragraph\">Spotify is also confused about who made this:<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"\/images\/web\/chet_yes_plis_search.png\" alt=\"search results for 'Yes Plis Chet'...\"\/><figcaption class=\"wp-element-caption\">Comma? Ampersand? Confusion!<\/figcaption><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">And though Amazon seems to know it&#8217;s by &#8220;Yes &amp; Plis,&#8221; if you ask for more about the band you can tell Amazon is commingling the songs like a shelf of widgets in its warehouse ostensibly sold by different companies.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"\/images\/web\/chet_yes_plis_amazon.png\" alt=\"Amazon Unlimited when you click the band name for 'Chet'...\"\/><figcaption class=\"wp-element-caption\">All the classic rock plus one sore thumb<\/figcaption><\/figure>\n\n\n\n<h3 class=\"wp-block-heading\">Get a Q!<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">I believe Google Play Music, YouTube, and these other services rely on what the music labels provide, and\/or then just do a string search. 
But this doesn&#8217;t work when band names are translated into English, or have weird punctuation, or the band name contains another group&#8217;s name, or the band lazily\/intentionally reuses an existing band name, &#8230;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">If only there were a vendor-neutral way to identify and disambiguate entities in the world. Of course there is: Wikidata! &#8220;The Sundays&#8221; are the entity <a href=\"https:\/\/www.wikidata.org\/wiki\/Q3122789\">Q3122789<\/a> in Wikidata that is an instance of a band, and then some person or bot added another entity <a href=\"https:\/\/www.wikidata.org\/wiki\/Q17231144\">Q17231144<\/a> that is also an instance of a band, also labeled &#8220;The Sundays&#8221; (until I edited it; see below). Same name, two different things.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">So these bands <em>can<\/em> be distinguished, but actually doing it is a hard problem as long as humans are in the loop: I don&#8217;t see Japanese press release writers writing &#8220;FOR IMMEDIATE RELEASE: The Sundays (<em>Q17231144<\/em>) release new album!&#8221; so that Google can disambiguate, nor will the people who translate that press release into English (whence I assume Google got excited on my behalf) add a note &#8220;<em>not the Q3122789 English band<\/em>.&#8221; \ud83d\ude42 Moreover, as I&#8217;ve written before, I&#8217;m convinced Google doesn&#8217;t actually want a semantic web where web pages tell computers what they mean; it wants a messy confused bunch of pages so that it can apply massive AI to this kind of disambiguation, so that only Google can provide good context-specific answers to questions like &#8220;What&#8217;s the last album from the Sundays?&#8221; 
(Also, the moment you make it easier for pages to say what they&#8217;re about, immediately a bunch of boner and diet pill pages will semantically identify themselves as &#8220;Latest news about Kardashian family&#8221; or whatever is a popular search term.) But then it&#8217;s frustrating to see Google itself get it wrong.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">What&#8217;s also frustrating is that others fixed the Genius lyrics site to distinguish &#8220;<a href=\"https:\/\/genius.com\/artists\/Corduroy\">Corduroy<\/a>&#8221;, &#8220;<a href=\"https:\/\/genius.com\/artists\/Corduroy-band\">Corduroy. (band)<\/a>&#8221; [note the period, sic(k)!], and &#8220;<a href=\"https:\/\/genius.com\/artists\/Corduroy-korea\">Corduroy (Korea)<\/a>&#8221;, but the same cleanup has to be repeated on every data-driven web site. Q numbers to rule them all!<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Cleaning up Wikidata<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">As with Wikipedia, anyone can edit Wikidata information. 
The Japanese Wikipedia article that seems to have generated the duplicate &#8220;The Sundays&#8221; entity in Wikidata is actually titled <a href=\"https:\/\/ja.wikipedia.org\/wiki\/SUNDAYS\">SUNDAYS<\/a>, so I changed the English label of Q17231144 to &#8220;SUNDAYS&#8221; and added the English description &#8220;Japanese rock band&#8221;; I also added the English description &#8220;1990s English alternative rock band&#8221; to Q3122789 to help avoid further errors.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">What&#8217;s odd is the Japanese band&#8217;s Wikidata page includes a bunch of identifiers for the <em>English<\/em> band in other online databases: the <a href=\"https:\/\/viaf.org\/viaf\/126826622\/\">VIAF identifier 126826622<\/a>, the <a href=\"https:\/\/catalogue.bnf.fr\/ark:\/12148\/cb13926837j\">Biblioth\u00e8que nationale de France identifier 13926837j<\/a>, the <a href=\"http:\/\/www.isni.org\/0000000110874877\">International Standard Name Identifier 0000 0001 1087 4877<\/a>, <a href=\"http:\/\/id.loc.gov\/authorities\/names\/n91122952\">the Library of Congress authority ID n91122952<\/a>, etc. <em>All<\/em> of the data in these other databases seems to apply to the English band, but all were missing from the English band&#8217;s Wikidata page. I suspect some automated bot found the Japanese &#8220;The Sundays,&#8221; incorrectly linked it to the VIAF identifier for the English &#8220;The Sundays,&#8221; and that in turn prompted other bots to add all those other identifiers to the wrong band. 
It seems poor design that an entity that obviously conflicts with another &#8220;band named &#8216;The Sundays&#8217;&#8221; entity gets all these automated identifiers for the other thing added to it.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The Icelandic group Casino doesn&#8217;t seem to have a Wikidata page&#8230; meanwhile Wikidata already has two &#8220;Casino&#8221; instances of a band, the well-known 2000s Casino from English Wikipedia and another from <a href=\"https:\/\/nl.wikipedia.org\/wiki\/Casino_(band)\">Dutch Wikipedia<\/a> that describes a one-off British band. As with the two &#8220;Sundays&#8221;, they have overlapping external identifiers; in fact, some bot mistakenly linked both of them to the Billboard artist page for an unrelated rapper who calls himself &#8220;Casino.&#8221; And on Google Play Music, the artist &#8220;Casino&#8221; identifies as the 2000s English &#8220;rock\/alternative band,&#8221; but most of the songs and tracks are clearly by black rapper(s) who adopted the moniker &#8220;Ca$ino&#8221; without caring about existing European bands.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>The Google Now screen on my phone does a good job presenting news relevant to me. It struck gold when it displayed &#8220;new album out now&#8221; by The Sundays, zOMG!! 
After 22 years, out of nowhere they deliver a new &hellip; <a href=\"https:\/\/www.skierpage.com\/blog\/2019\/05\/music-web-sundays-spent-disambiguating\/\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_crdt_document":"","footnotes":""},"categories":[13,25],"tags":[],"class_list":["post-863","post","type-post","status-publish","format-standard","hentry","category-music","category-semantic-web-web"],"_links":{"self":[{"href":"https:\/\/www.skierpage.com\/blog\/wp-json\/wp\/v2\/posts\/863","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.skierpage.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.skierpage.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.skierpage.com\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.skierpage.com\/blog\/wp-json\/wp\/v2\/comments?post=863"}],"version-history":[{"count":7,"href":"https:\/\/www.skierpage.com\/blog\/wp-json\/wp\/v2\/posts\/863\/revisions"}],"predecessor-version":[{"id":1535,"href":"https:\/\/www.skierpage.com\/blog\/wp-json\/wp\/v2\/posts\/863\/revisions\/1535"}],"wp:attachment":[{"href":"https:\/\/www.skierpage.com\/blog\/wp-json\/wp\/v2\/media?parent=863"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.skierpage.com\/blog\/wp-json\/wp\/v2\/categories?post=863"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.skierpage.com\/blog\/wp-json\/wp\/v2\/tags?post=863"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}