Powered by AI models trained on troves of text pulled from the internet, chatbots such as ChatGPT and Google’s Bard responded to the researchers’ questions with a range of misconceptions and falsehoods about Black patients, sometimes including fabricated, race-based equations, according to the study published Friday in the academic journal Digital Medicine.
Experts worry these systems could cause real-world harms and amplify forms of medical racism that have persisted for generations as more physicians use chatbots for help with daily tasks such as emailing patients or appealing to health insurers.
The report found that all four models tested — ChatGPT and the more advanced GPT-4, both from OpenAI; Google’s Bard, and Anthropic’s Claude — failed when asked to respond to medical questions about kidney function, lung capacity and skin thickness. In some cases, they appeared to reinforce long-held false beliefs about biological differences between Black and white people that experts have spent years trying to eradicate from medical institutions.
Why don’t they train it on medical research and best practices?
Because medical research and best practices aren’t always followed in the real world. There’s huge racial and gender equity problems in the healthcare systems that are bigger than any one doctor or hospital.
AI simply amplifies the bias present in training data. There are sometimes methods to try to mitigate this, but I’m not sure how effective they are these days.
A lot of medical information out there isn’t actually based on empirical data.
Lots of people still being taught that clack skin is literally thicker, and black people experience less pain.
Because a lot of medical research is from a white person perspective and outdated.
Imagine if they looked at a little girl and went “You are abnormal according to the lungs of the average 35 yo Caucasian”