1 link tagged with all of: language-models + neural-activations + model-transparency
Links
Researchers used a “concept injection” method to compare Claude’s self-reported thoughts with its actual neural activity. They found Claude Opus 4 and 4.1 sometimes detect and control injected concepts, suggesting limited but real introspective abilities that improve with model capacity.