AI for Information Ecosystems

Last updated on Jul 18, 2026

My research interests lie in building AI systems for information ecosystems: how people create, access, evaluate, and act on information. News media has been my primary application domain, and I have explored both the opportunities and risks of AI systems in information-intensive workflows.

I am particularly interested in three themes: building AI systems for real-world information workflows, evaluating the trustworthiness and risks of large language models, and enabling broader adoption of data science through competitions, books, and community activities.

Building AI Systems for Information-Intensive Workflows

There are many opportunities to apply AI and information technology to information-intensive workflows, including content creation, knowledge organization, recommendation, analysis, and operational support. In my work in news media, I collaborated closely with editors, engineers, and business teams to develop and evaluate AI systems in real-world production settings. Some of these projects have been published externally as press releases, research papers, and technical presentations.

Domain-specific pre-trained models (Press release)
Improving operational efficiency:
- News summarization (Journal of NLP)
- Entity linking system (Journal of NLP)
Creating new experiences:
- Recommendation from daily scenery (KES-2025)
- Crossword puzzle generation (CIKM 2023)
Service & user analysis:
- Effects of delivery formats (IC2S2 2026)
- Semantic shift analysis (IC2S2 2023)
- Reading time estimation (BigData 2022 Industrial & Government Track)

Evaluating Trust and Risk in Large Language Models

Large language models raise important questions about trust, privacy, copyright, security, and evaluation. One of my research interests is understanding how training data persists in model behavior and how the resulting risks can be measured empirically. I have conducted experiments to quantify memorization, primarily using Japanese newspaper articles as a high-quality, real-world corpus. I have also published survey papers and released tools for benchmarking membership inference attacks and related memorization risks.

Survey paper (ACL 2023 Workshop, Transactions of JSAI)
Benchmarking membership inference attacks (ACL 2026 Demo)
Experiments on Japanese newspaper (INLG 2024, ACL 2025 Workshop)
Monitoring time-series performance degradation (AACL-IJCNLP 2022, Journal of NLP)

Enabling Data Science and Developer Communities

Data science competitions, such as those hosted on Kaggle, are a powerful way to make machine learning and AI more accessible, practical, and community-driven. I have contributed to this ecosystem by publishing books, organizing competitions, writing newsletters, and supporting educational activities. Through this work, I aim to help a broader range of developers, students, researchers, and practitioners learn how to apply AI technologies to real-world problems.

The books, newsletters, hosted competitions, and community events are summarized on the DevRel page. See also Data Science Competitions for my results as a participant, and the Committee section in Publications for committee roles.

Reference

See Publications for more details.