**Yet another AI Racism example** https://shkspr.mobi/blog/2024/09/yet-another-ai-racism-example/ Here's a good pub-quiz trivia question - which Oscar-winning Actors have appeared in Doctor Who? It's the sort of thing that you can either rack your brains for, or construct a SPARQL Query for WikiData<a href="#fn:spq" class="footnote-ref" title="You can see the query for nominees and the subsequent results" role="doc-noteref">0</a>. I was bored and asked ChatGPT. The new [Omni model](https://openai.com/index/hello-gpt-4o/ ) claims to be faster and more accurate. But, in my experience, it's wrong more than it is right and is a bit more racist. I asked "[Which Oscar winners have appeared in episodes of Doctor Who?]( )" Here are the results: OK, first up, those are all entirely accurate! Capaldi *is* an Oscar-winning Doctor Who. Colman is the only Oscar-winning baddie. And I am happy to spend hours in the pub arguing over whether [The Curse of Fatal Death]( ) is canon<a href="#fn:cannon" class="footnote-ref" title="It is." role="doc-noteref">1</a>. But then things get… weird. John Hurt didn't win [an honorary award in 2012]( ). He was mentioned in the [memoriam montage]( ) in 2017. Ben Kingsley was [*rumoured* to be playing Davros back in 2007]( ) - but it never happened. He did win an Oscar though. Ecclesdoc *was* in The Others. [It *did* win many awards]( ). But not a single Oscar. There isn't even an award for "Best Art Direction". Finally, this is tacked onto the end. Look, we all love Lynda Baron - and she was excellent in The Gunfighters, Enlightenment, and Closing Time. I was surprised to find out she was in Yentl - but indeed she was! However the songwriting Oscar went to Michel Legrand and Alan & Marilyn Bergman. Not her.

## [Why is this racist](#why-is-this-racist )

This "AI" would rather hallucinate than acknowledge the Black actors who have been in Doctor Who. Sophie Okonedo plays [Queen Elizabeth the 10th]( ) in "The Beast Below". 
Not only is she "the bloody Queen, mate" - she was [nominated for Best Supporting Actress]( ) for Hotel Rwanda. She has as much right to be in the list ChatGPT provided as John Hurt. With no disrespect intended to Kingsley, Eccleston, and Baron - Sophie Okonedo is much closer to the original question than they are. This isn't a knowledge cut-off issue either, she was nominated *before* Olivia Colman won. It's not like she's a bit-part. She's not an alien under a mountain of prosthetics. She's literally top of the credits after The Doctor and Amy! And then, there's the small matter of [Planet of the Dead]( ). It isn't a *great* episode. But it has a nice turn from Michelle Evans and Lee Evans<a href="#fn:evans" class="footnote-ref" title="No relation." role="doc-noteref">2</a>. Oh, and this guy… That's **ACTUAL FUCKING OSCAR WINNER** Daniel Kaluuya. He got a nomination for Get Out, but [won for Judas and the Black Messiah]( ) in 2021. Again, he isn't an unnamed background artist. He isn't there under his pre-fame stage name. He's an integral part of the show.

## [What does this teach us?](#what-does-this-teach-us )

The query I asked wasn't a matter of opinion. It isn't a controversial question. There aren't multiple sources which could be considered trustworthy. It is a simple question of facts. So why does ChatGPT fail? LLMs are *not* repositories of knowledge. They have a superficial view of the world and are unable to tell fact from speculation. They are specifically built to be confidently wrong rather than display their ignorance. And, yes, they are as biased as hell. 
There is no way that you can explain the exclusion of Sophie Okonedo and Daniel Kaluuya without acknowledging the massive levels of racial prejudice which are baked into either the model or its training data.

<li id="fn:spq" role="doc-endnote"><p>You can see the <a href="https://w.wiki/B7C$">query</a> for nominees and the subsequent <a href="https://w.wiki/B7Cz">results</a>&nbsp;<a href="#fnref:spq" class="footnote-backref" role="doc-backlink">↩︎</a></p></li><li id="fn:cannon" role="doc-endnote"><p>It is.&nbsp;<a href="#fnref:cannon" class="footnote-backref" role="doc-backlink">↩︎</a></p></li><li id="fn:evans" role="doc-endnote"><p>No relation.&nbsp;<a href="#fnref:evans" class="footnote-backref" role="doc-backlink">↩︎</a></p></li>

#DoctorWho #racism
**Book Review: Somewhere To Be - Laurie Mather** My friend has published their first novel - and it is a *cracker!* After a calamitous accident, the Fairy realm is cut off from the mundane world. Only one trickster remains, a sprite by the name of Mainder who is now trapped on our side. All seems to be going well in his little corner of the world, until a plucky team of archaeologists start digging around the shattered ruins of the portal between worlds. It isn't a startlingly original take on a well-trodden subject; but it isn't intended to be. It's a cosy - slightly sexy - story of people whirling around each other, caught in a mystic tangle of intrigue. There are some lovely touches and clever little twists on the genre - including how to use a smartphone while trying to find your way through an enchanted forest and the perils of ethical seduction in interspecies romance. It's well paced and the frequent hops in time help flesh out the story without resorting to tedious exposition. A great debut. #BookReview #fantasy
**1,000 edits on OpenStreetMap** https://shkspr.mobi/blog/2024/05/1000-edits-on-openstreetmap/ Today was quite the accidental milestone! I've edited OpenStreetMap over a thousand times! For those who don't know, OSM (OpenStreetMap) is like the Wikipedia of maps. Anyone can go in and edit the map. This isn't a corporate-controlled space where your local knowledge is irrelevant compared to the desire for profit. You can literally go and correct any mistakes that you find, add recently built roads, remove abandoned buildings, and provide useful local information. Editing the full map is... complicated. For simple edits like changing the times of a postal collection, there are simple forms you can fill in. There's also an aerial view so you can drag and drop misplaced locations. But for anything more complicated than that, you'll need to spend some time understanding the interface. There's a friendly community who are happy to check or correct your submissions. I'll be honest, I don't use the web editor much. Instead, I use [the Android app StreetComplete]( ). It's like an endless stream of sidequests. As you travel through the world, it will ask if a shop is still open, or if the highway is lit, or how many steps there are on a bridge, or whether a playground is suitable for all children, or if restaurants serve vegetarian food, or if a bus-stop has a bench, or... the list is almost endless! I use it when I'm walking around somewhere new, or on holiday, or waiting for a bus. I used it so much that, for a short while, [I became the #1 mapper in New Zealand]( )! So get stuck in! Make mapping more equitable and more accurate. #OpenStreetMap #ReDeCentralize
**Never use a URL shortening service - even if you own it** The Guardian launched its online adventures back in 1999. At some point, they started using the name "Guardian Unlimited". Hey, the dot com boom made us all do crazy things! As part of that branding, they proudly used the domain GU.com. Over time, the branding faded and GU.com became a URL shortening service. Tiny URLs like gu.com/abc could be printed in papers, sent via SMS, or posted on Twitter. They made [a huge fanfare about how it would help with analytics]( ). You can [read some of the history of the shortener]( ) to understand why it was created. And now, for reasons best known to themselves, The Guardian have stopped the service and [put GU.com up for sale](https://gu.com/ ). The starting price is TWO AND A HALF MILLION DOLLARS! Look, if I had an asset that valuable and was looking at declining revenue, I'd sell it. But breaking that URL comes with a problem. I've written before about [why URL shortening is bad for users and bad for the web.]( ) I've even helped publish [government guidance]( ) about it. But all of those were based on the premise that the shortener was a 3rd party service. I never thought someone would be as daft as to switch off their own service. Here are some of the problems this sale causes. Is there a tweet somewhere of a future politician saying "I support this 100% GU.com/...."? Redirect that to something horrific and you have a potential scandal on your hands. There are [lots of academic papers with gu.com shortened links]( ). Those are all now dead. Millions of links around the web - including many [*on the Grauniad itself*]( ) - are all now broken. The Guardian could fix this by publishing a list of all the shortened URLs. That wouldn't stop links breaking, but would make it possible for researchers to reconstruct the original destination. For decades, we've tried to remind people that "[Cool URLs Don't Change]( )". 
We'll just have to hope that the people of the future find a way to decipher all these obsolete links. #guardian #hyperlinks #newspapers #url #web
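
If a list of the shortened URLs were ever published, reconstructing old links could be fairly mechanical. Here is a minimal sketch in Python. It assumes a simple two-column CSV of short code and destination; that format, the sample rows, and the function names are all invented for illustration, since no such list exists at the time of writing.

```python
import csv
import io
import re

# Hypothetical published list: short code -> original destination.
# These rows are invented for illustration only.
PUBLISHED_LIST = """code,destination
abc,https://www.theguardian.com/example/article-one
xyz,https://www.theguardian.com/example/article-two
"""

# Matches gu.com/<code>, with or without a scheme, but not other hosts
# that merely end in "gu.com" (for example "www.gu.com" or "foogu.com").
SHORT_LINK = re.compile(
    r"(?<![\w.-])(?:https?://)?gu\.com/([A-Za-z0-9]+)", re.IGNORECASE
)


def load_mapping(csv_text):
    """Read the published list into a dict keyed by short code."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["code"]: row["destination"] for row in reader}


def expand_short_links(text, mapping):
    """Swap every known short link in `text` for its destination.

    Unknown codes are left untouched, so nothing is silently guessed.
    """
    def replace(match):
        return mapping.get(match.group(1), match.group(0))

    return SHORT_LINK.sub(replace, text)
```

Calling `expand_short_links("See gu.com/abc.", load_mapping(PUBLISHED_LIST))` swaps in the long URL and leaves the trailing full stop alone. Codes missing from the list stay exactly as they were found, which seems the safer default for archival work.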
## I've locked myself out of my digital life

Imagine… Last night, lightning struck our house and burned it down. I escaped wearing only my nightclothes. In an instant, everything was vaporised. Laptop? Cinders. Phone? Ashes. Home server? A smouldering wreck. Yubikey? A charred chunk of gristle. This presents something of a problem. In order to recover my digital life, I need to be able to log in to things. This means I need to know my usernames (easy) and my passwords (hard). All my passwords are stored in a Password Manager. I *can* remember the password to that. But logging in to the manager *also* requires a 2FA code. Which is generated by my phone. The phone which now looks like this: Oh.

## [Backups](#backups )

I'm relatively smart and sensible. I regularly exported my TOTP secrets and saved them in an encrypted file on my cloud storage - ready to be loaded onto a new phone. But to get into my cloud, I need my password and 2FA. And even if I could convince the cloud provider to bypass that and let me in, the backup is secured with a password which is stored in - you guessed it - my Password Manager. I am in cyclic dependency hell. To get my passwords, I need my 2FA. To get my 2FA, I need my passwords. Perhaps I can use my MFA FIDO2 Key? Oh.

## [Emergency Contacts](#emergency-contacts )

Various services allow a user to designate an "emergency contact". Someone who can access your account *in extremis*. Who do you trust enough with the keys to your digital life? I chose my wife. The wife who lives with me in the same house. And, obviously, has just lost all her worldly possessions in a freak lightning strike. Oh.

## [Recovery Codes](#recovery-codes )

Most online services which have Multi-Factor Authentication also provide "recovery codes". They are, in effect, one-time override passwords. A group of random characters which will bypass any security. Each can only be used once, and then is immediately revoked. I was clever. 
I hand-wrote the codes on a piece of paper (so they can't be recovered from my printer's memory!) and stored them in a fire-proof safe, secured with a key hidden under the cat's litter-box. Sadly, the fire-proof safe wasn't lightning-strike safe and is now obliterated. Along with the cat's litter-box. The cat is fine. I know… I know… I *should* have kept them in a lock-box in my local bank. The only problem is, [virtually no banks offer safe deposit boxes in the UK]( ). The one that does charges [£240 per year]( ). A small price to pay, for some, to avoid irreversible loss. But it adds up to a significant ongoing cost. But, suppose I had stored everything off-site. All I'd need to do is walk up to the bank and show some ID which proved that I was the authorised user of that box. The ID which has just been sacrificed in tribute to mighty Thor and now looks like a melted waxwork. [](https://twitter.com/swestdahl/status/1533504584328523776 ) Oh.

## [Friendly Neighbourhood Storage](#friendly-neighbourhood-storage )

Perhaps what I *should* have done is stored all my backup codes and recovery keys on a USB stick and then given them to a friend? There are a few problems with that.

<li>Every time I sign up to a new service, I would need to add it to the USB stick. How many times can I pop round with a fresh stick before it becomes an imposition?</li>
<li>What if my friend (or their kid) accidentally wipes the drive?</li>
<li>If a freak lightning storm hits both our houses at the same time, I still lose everything.</li>
<li>Even if I did all that, I would have to give the USB stick a strong password to make sure my friend didn't betray me. So I either need to remember that, or I'm stuck in the password-manager-paradox.</li>

Perhaps I could split the USB sticks between multiple friends using [Shamir's Secret Sharing]( )? That solves some problems - mostly the accidental losses and remembering a strong password - but creates *even more* issues. 
Now I have to do a lot more admin *and* worry about all my friends conspiring against me!

## [Phone Home](#phone-home )

One of the weakest forms of identity is the humble phone number. Several of my accounts use my mobile number to text me authorisation codes. SMS isn't the most secure way to deliver passwords - it can be intercepted or the SIM can be swapped to one controlled by an attacker. But, *if* I can get my phone number back, I stand a chance of getting in to my email and perhaps some other services. That's a weakness in my security posture. But one I may need to take advantage of. The only question is - how do I prove to the staff at my local phone shop that I am the rightful owner of a SIM card which is now little more than soot? Perhaps I can just rock up and say "Don't you know who I am?!?!" I know, I'll show them my passport! Oh.

## [Bootstrapping of trust](#bootstrapping-of-trust )

I am lucky. I have a nice middle-class life and know lots of professionals - doctors, lawyers, teachers - who I *hope* would be happy to vouch for me. I could use one of my friends to [confirm my identity for a replacement passport]( ). Once I have a passport, I should be able to get a SIM card with my phone number. And, I hope, some online services. I would, however, need to use a credit or debit card to apply for a replacement passport. But all of my cards are melted to slag - and I can't prove to the bank that I am who I say I am because I don't know my account number, password, or mother's maiden name. You see, I was "clever" and took some idiot's advice about [setting your mother's maiden name to being a random string of characters]( ). Those details are, of course, stored in my inaccessible password manager! Hopefully one of my friends will be prepared to lend me the £75.50 to get a new passport. I'll just call up one of my friends. Hmmm… now, where did I store their phone number? Oh.

## [Starting over](#starting-over )

Again, I'm lucky. 
I live relatively close to some friends and family. And I'm confident that they'd be gracious enough to pay an emergency cab fare if I started hammering on their door at silly o'clock in the morning. With their help, I think I could probably call up enough insurance companies to figure out which one covered the property. I would hope the insurance company would have some way of validating with the emergency services that the house is, indeed, a smoking crater. I don't know if that would get me emergency cash, or if I'd have to rely on friends until I get access to my bank account. I assume my credit card companies can probably be convinced to send out replacement cards. But will they also be willing to change my address - or will the card go to the pile of ashes which was formerly my home? I don't know whether my insurance policy covers me for access to digital files. Even if it did, I'm not sure how they can force a company like - say - Google to give me access to my account. It isn't like Google went through a KYC (Know Your Customer) process when I signed up.

## [Code Is Law](#code-is-law )

This is where we reach the limits of the "Code Is Law" movement. In the boring analogue world - I am pretty sure that I'd be able to convince a human that I am who I say I am. And, thus, get access to my accounts. I may have to go to court to force a company to give me access back, but it is *possible*. But when things are secured by an unassailable algorithm - I am out of luck. No amount of pleading will let me in without the correct credentials. The company which provides my password manager simply doesn't have access to my passwords. There is no-one to convince. Code is law. Of course, if I can wangle my way past security, an evil-doer could also do so. 
So which is the bigger risk:<li>An impersonator who convinces a service provider that they are me?</li><li>A malicious insider who works for a service provider?</li><li>Me permanently losing access to all of my identifiers?</li> I don't know the answer to that. If you have a strong opinion, please let me know in the comment section. In the meantime, please rest assured that my home is still standing. But, if you can, please donate generously to the [DEC's Ukraine Humanitarian Appeal](https://donation.dec.org.uk/ukraine-humanitarian-appeal ) #2fa #passwords #security
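
The "cyclic dependency hell" from the Backups section can be checked mechanically. Here is a rough sketch in Python: write down what each thing needs before it can be used, then work out what is still reachable from whatever survived the fire. The item names and the `REQUIRES` table are my own simplification of the story above, not anyone's real setup.

```python
# What each thing needs before it can be used (a simplified, assumed model).
REQUIRES = {
    "password manager": {"master password", "2FA code"},
    "2FA code": {"phone"},
    "phone": {"TOTP backup"},  # a replacement phone needs the secrets restored
    "TOTP backup": {"cloud storage", "backup password"},
    "cloud storage": {"cloud password", "2FA code"},
    "cloud password": {"password manager"},
    "backup password": {"password manager"},
}


def recoverable(have, requires):
    """Return everything that can be unlocked starting from `have`.

    Repeatedly unlocks any item whose requirements are all already
    unlocked, stopping once a full pass adds nothing new.
    """
    unlocked = set(have)
    changed = True
    while changed:
        changed = False
        for item, needs in requires.items():
            if item not in unlocked and needs <= unlocked:
                unlocked.add(item)
                changed = True
    return unlocked
```

With only the memorised master password, `recoverable({"master password"}, REQUIRES)` returns just that one item: the loop never gets started. Add a single independent way to produce a 2FA code (say, a recovery code kept somewhere that didn't burn) and everything else unlocks in turn. Running this against your own setup before disaster strikes is probably a better use of an evening than explaining yourself to a phone shop.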
**The unreasonable effectiveness of simple HTML** I've told this story at conferences - but due to *the general situation* I thought I'd retell it here. A few years ago I was doing policy research in a housing benefits office in London. They are singularly unlovely places. The walls are brightened up with posters offering helpful services for people fleeing domestic violence. The security guards on the door are cautiously indifferent to anyone walking in. The air is filled with tense conversations between partners - drowned out by the noise of screaming kids. In the middle, a young woman sits on a hard plastic chair. She is surrounded by canvas-bags containing her worldly possessions. She doesn't look like she is in a great emotional place right now. Clutched in her hands is a games console - a PlayStation Portable. She stares at it intensely; blocking out the world with Candy Crush. Or, at least, that's what I thought. Walking behind her, I glance at her console and recognise the screen she's on. She's connected to the complimentary WiFi and is browsing the [GOV.UK pages on Housing Benefit]( ). She's not slicing fruit; she's arming herself with knowledge. The PSP's web browser is - charitably - [pathetic](https://playstationdev.wiki/pspdevwiki/index.php?title=Webbrowser ). It is slow, frequently runs out of memory, and can only open 3 tabs at a time. But the GOV.UK pages are written in simple HTML. They are designed to be lightweight and will work even on rubbish browsers. They have to. This is for everyone. Not everyone has a big monitor, or a multi-core CPU burning through the teraflops, or a broadband connection. The photographer Chase Jarvis coined the phrase "[the best camera is the one that’s with you]( )". He meant that having a crappy instamatic with you at an important moment is better than having the best camera in the world locked up in your car. The same is true of web browsers. If you have a smart TV, it probably has [a crappy browser]( ). 
My old car had [a built-in crappy web browser]( ). Both are painful to use - but *they work!* If your laptop and phone both got stolen - how easily could you conduct online life through the worst browser you have? If you have to file an insurance claim online - will you get sent a simple HTML form to fill in, or a DOCX which won't render? What vital information or services are forbidden to you due to being trapped in PDFs or horrendously complicated web sites? Are you developing public services? Or a system that people might access when they're in desperate need of help? Plain HTML works. A small bit of simple CSS will make it look decent. JavaScript is probably unnecessary - but can be used to progressively enhance stuff. Add alt text to images so people paying per MB can understand what the images are for (and, you know, accessibility). Go sit in an uncomfortable chair, in an uncomfortable location, and stare at an uncomfortably small screen with an uncomfortably outdated web browser. How easy is it to use the websites you've created? I chatted briefly to the young woman afterwards. She'd been kicked out by her parents and her friends had given her the bus fare to the housing benefits office. She had nothing but praise for how helpful the staff had been. I asked about the PSP - a hand-me-down from an older brother - and the web browser. Her reply was "It's shit. But it worked." 
I think that's all we can strive for.

Here are some stats on games consoles visiting GOV.UK

> Matt Hobbs (@TheRealNooshu@hachyderm.io)
> Interestingly we have 3,574 users visiting GOV.UK on games consoles:
> • Xbox - 2,062
> • Playstation 4 - 1,457
> • Playstation Vita - 25
> • Nintendo WiiU - 14
> • Nintendo 3DS - 16
> 20/22
> [10:45 - Mon 01 February 2021](https://twitter.com/TheRealNooshu/status/1356192030695845889 )

#HTML5 #web #WeekNotes #work
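
The advice above about alt text is easy to check automatically. Here is a minimal sketch using Python's built-in `html.parser` module. The class and function names are my own, and it only flags images with no `alt` attribute at all, on the assumption that an empty `alt=""` is a deliberate marker for a decorative image.

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Collects the src of every <img> that has no alt attribute."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <img ... /> also end up here by default.
        if tag != "img":
            return
        attributes = dict(attrs)
        if "alt" not in attributes:
            self.missing.append(attributes.get("src") or "(no src)")


def images_missing_alt(html):
    """Return the src values of images in `html` that lack alt text."""
    finder = MissingAltFinder()
    finder.feed(html)
    finder.close()
    return finder.missing
```

Feed it the HTML of a page and anything in the returned list is an image that someone on a metered connection, or using a screen reader, has no way to understand without downloading it.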
## Scammers registering date-based domain names

Yesterday, January 2nd, my wife received a billing alert from her phone provider. Luckily, she's not with EE - because it's a pretty convincing text. That domain name is specifically designed to include the day's date. If you're stood up on a crowded train, with your phone screen cracked, would you notice that a . is where a / should be? A quick look at the URL shows a trusted domain at the start - followed by today's date. It starts with https:// - that means it's secure, right? Is .info even recognisable as a Top Level Domain? Scammers know these domains get blocked pretty quickly - so there's no point registering a generic name like billing-pdf.biz only to have it burned within a day. By the time I'd fired up a VM to inspect it, major browsers were already blocking the site as suspicious. Is there any way to stop this? No, not really. Domain names are cheap - you can buy a new .info for a couple of quid. The https:// [certificate was freely provided by Let's Encrypt]( ). The site was probably hosted somewhere cheap, and whose support staff are asleep when abuse reports come in from the UK. And that's the price we pay for anyone being able to buy their own domain and run their own secure site. Money and technical expertise used to be strong barriers to prevent people from registering scam domains. But those days are long gone. There are no technical gatekeepers to keep us safe. We have to rely on our own wits. #phishing #scam #spam
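
The trick with the . where a / should be works because people read a URL from left to right, while the domain is really decided from right to left. A machine doesn't have that problem. Here is a small sketch in Python using the standard `urllib.parse` module. The function name and the example addresses are invented for illustration, with `example.co.uk` standing in for the provider's real domain.

```python
from urllib.parse import urlsplit


def belongs_to(url, trusted_domain):
    """Return True only if the URL's host is the trusted domain
    or one of its subdomains."""
    host = (urlsplit(url).hostname or "").lower().rstrip(".")
    trusted = trusted_domain.lower()
    return host == trusted or host.endswith("." + trusted)


# Invented examples: the first is the genuine shape, the second is the scam shape.
genuine = "https://example.co.uk/02-01/billing"
scam = "https://example.co.uk.02-01.info/billing"
```

`belongs_to(genuine, "example.co.uk")` is true and `belongs_to(scam, "example.co.uk")` is false, because the scam host *ends* in `02-01.info` no matter what it starts with. This doesn't stop anyone registering such a domain, but it is roughly the comparison a link checker would need to make on your behalf.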
## <input type="country" />

Recently, Lea Verou asked an important question about whether HTML should have a standardised way of letting users select a country from a list.

> Lea Verou (@LeaVerou)
> HTML Idea: <input type="country"> which would become a searchable dropdown with all countries and their flags.
> Wouldn't that be awesome?
> [13:17 - Sat 21 October 2017](https://twitter.com/LeaVerou/status/921727157705035776 )

You can read through the conversation and make your own mind up (while also marvelling at the witless mansplainers) - but I'd like to give you my considered take on it. (Disclaimer - I'm an editor on the HTML 5.3 spec and I work for the UK Government. This is a personal blog post and doesn't represent the views of my employers, associates, or friends.)

## [Who Are You?](#who-are-you )

Let's start with the big one. What is a country? This is about as contentious as it gets! It involves national identities, international politics, and hereditary relationships. Scotland, for example, is a country. [That is a (fairly) uncontentious statement](http://www.parliament.uk/about/living-heritage/evolutionofparliament/legislativescrutiny/act-of-union-1707/overview/ ) - and yet in drop-down lists, I rarely see it mentioned. Why? Because it is one of the four countries which make up the country of the United Kingdom - and so it is usually (but not always) subsumed into that. Some countries don't recognise each other. Some believe that the other country is really part of *their* country. [Some countries don't exist]( ). There are two main schemes to classify what is and isn't a country. The first is ISO 3166-1. It provides two- and three-letter codes for every country. Well... sort of. ISO 3166 contains 249 different countries, territories, protectorates, principalities, duchies, and other bits-and-bobs. It contains the Falklands, but not Scotland. The second is... whatever your country says is another country! 
My friends in the Government Registers Team have published [a canonical list of every country that the UK recognises]( ). There are 199 entries. Which countries are *not* in there is left as an exercise for the reader. The UK's register of countries should allow every Government website to have the same list in a drop down. When new countries are recognised, one list needs to be updated - and then all websites automagically update. In theory. Incidentally, that list of 199 countries includes four entries for countries **which no longer exist**. For example Yugoslavia. Which brings us to the next question...

## [What's the use case?](#whats-the-use-case )

The most obvious one is "I want to give a site my current address" - presumably for identification purposes or postal deliveries. But what if the use case is "I want to say where I was born"? Borders shift. Countries disappear, merge, split, change names, change flags, and do all manner of weird things which trip up your edge cases. The user may want to find the name in their own script - for example would a Greek user be looking for "Greece" or "Ελλάδα"? If a Chinese speaker wants to visit the UK, do they look in the drop-down for "英国"? International Dialling Codes - not every country is unique - +1 is used by USA, Canada, Anguilla, Dominican Republic, and dozens more. Are there [countries where there is more than one international dialling code]( )? OK, what if the user wants to select their language based on their country?

## [Do You Have A Flag?](#do-you-have-a-flag )

It is one of the classic conventions that first-year students of user interface design are taught - [countries do not represent language]( )! Some countries have multiple official languages. Some users may not speak the language of their country. Some languages are only used for official purposes, and not by the general population. Flags *mostly* represent countries. 
There are people in Wales who would rather see Y Ddraig Goch than the [Union Jack]( ). And vice-versa. Flags can make people angry. The flag of the USA last changed in 1960 - but [Mauritania changed theirs in August 2017](https://www.washingtonpost.com/news/worldviews/wp/2017/08/08/mauritanias-president-bundles-a-patriotic-flag-change-with-abolishing-the-senate/ ). How quickly can a browser update their list of countries?

## [...and yet...](#and-yet )

I instinctively *like* this idea! [This isn't a new question](https://twitter.com/Glightstar/status/714203191999664129 ), nothing ever is, but I think it is an idea which has merit. One of the goals of HTML is to stop web developers having to re-invent the wheel. That's why we have lots of different &lt;input&gt; types - to reduce complexity.

<li>Colour picker</li>
<li>Number inputs</li>
<li>Range selector</li>
<li>Some modern browsers support date input</li>

The challenges of a country selector are...

<li>Keeping everyone happy and not causing major diplomatic incidents. Easy‽</li>
<li>Usability. Making sure it's easy to search for the name of a country.</li>
<li>Consistency. How do you indicate that this list contains historic countries?</li>

None of these are insurmountable problems - but it's far from trivial. And yet... I think there is a real possibility that this could work. Millions of websites already find ways to cope with the ambiguity - perhaps browsers can too? #flag #i18n #NaBloPoMo
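
One way to cope with the "where I was born" use case is to give every entry in the list an optional start and end date, then only offer the countries which existed on the date in question. Here is a rough sketch in Python. The `CountryEntry` structure, its field names, and the handful of sample entries are my own simplification for illustration; this is not the format of the UK register or of ISO 3166.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class CountryEntry:
    code: str
    name: str
    start: Optional[date] = None  # None means "no recorded start"
    end: Optional[date] = None    # None means "still exists"

    def existed_on(self, day):
        """True if this entry was a country on the given day."""
        after_start = self.start is None or self.start <= day
        before_end = self.end is None or day <= self.end
        return after_start and before_end


# Simplified sample data: Czechoslovakia split at the end of 1992.
REGISTER = [
    CountryEntry("CS", "Czechoslovakia", end=date(1992, 12, 31)),
    CountryEntry("CZ", "Czechia", start=date(1993, 1, 1)),
    CountryEntry("SK", "Slovakia", start=date(1993, 1, 1)),
    CountryEntry("FR", "France"),
]


def options_for(day, register=REGISTER):
    """Names to show in the drop-down for a given date, sorted A to Z."""
    return sorted(entry.name for entry in register if entry.existed_on(day))
```

Someone born in 1980 would be offered Czechoslovakia and France. Someone giving a current address would be offered Czechia, France, and Slovakia. It does nothing for the diplomatic problems, the scripts, or the flags - but it suggests the "historic countries" question is at least answerable with data.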