LLM Security

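Is "hello" more dangerous than sophisticated hacking? AI attack success rates…

June 17, 2025

An empirical analysis of harmfulness attack strategies against LLMs. Large language models (LLMs) such as OpenAI's ChatGPT and Anthropic's Claude…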

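Why the aerospace industry is going all in on AI commercialization in 2025

June 11, 2025

Artificial Intelligence in Aerospace and Defense. Facing a production backlog of 17,000 aircraft, the industry looks to AI for manufacturing innovation. Aerospace and defense (A&D)…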

AI has "tells" when it lies, too… Harmful answer generation…

May 29, 2025

SafetyNet: Detecting Harmful Outputs in LLMs by Modeling and Monitoring Deceptive Behaviors. A technique for spotting an AI's "bad thoughts" in advance,…

"Constitutional classifiers" developed to improve AI safety… Anthropic researchers, 3,000 hours…

February 4, 2025

Constitutional Classifiers: Defending against Universal Jailbreaks across Thousands of Hours of Red Teaming. Regarding large language models (LLMs), the AI research company Anthropic…

Microsoft on the current state of AI safety… Human judgment becomes even more important

January 14, 2025

Lessons from red teaming 100 generative AI products. Eight key lessons the AI red team discovered. Microsoft's AI Red Team (AIRT), 100…

AUTODAN-TURBO: A LIFELONG AGENT FOR STRATEGY SELF-EXPLORATION TO JAILBREAK LLMS

"AutoDAN-Turbo" bypasses AI safety measures, LLM attack success rate up to…

October 21, 2024

As large language models (LLMs) advance rapidly, attempts to misuse them are also increasing. Recently, Wisconsin-Madison…
