I’ve never tried to fool or trick AI with excessively complex questions. When I tried to test it (a few different models over some period of time - ChatGPT, Bing AI, Gemini) I asked stuff as simple as “what’s the etymology of this word in that language”, “what is [some phenomenon]”. The models still produced responses ranging from shoddy to absolutely ridiculous.
completely detached from how anyone actually uses
I’ve seen numerous people use it the same way I tested it, basically a Google search that you can talk with, with similarly shit results.
Why do we expect a higher degree of trustworthiness from a novel LLM than we do from any given source or forum comment on the internet?
At what point do we stop hand-wringing over LLMs failing to meet some perceived level of accuracy and hold the people using them responsible for verifying the responses themselves?
There’s a giant disclaimer on every one of these models that responses may contain errors or hallucinations. At this point I think it’s fair to blame the user for ignoring those warnings and not the models for not meeting some arbitrary standard.