Exploring Transparency and AI Assessment in LLM-Assisted Research Applications

Abstract

In this work, we further investigate the utility of a large language model (LLM) as a research assistant that identifies the research grant funding opportunities best suited to a set of capabilities the user describes in natural language. The use case is a United States Department of Defense Small Business Innovation Research (SBIR) Broad Agency Announcement (BAA). To explore principles of responsible and ethical artificial intelligence and accountability in the context of LLM-driven applications, we perform clustering on embeddings and apply a suite of metrics to compare cluster quality against Latent Semantic Indexing. Further, we use visualization techniques to depict the contents of funding opportunities that lie at the intersection of multiple capabilities. Finally, we show the importance of keeping a human in the loop to vet data quality.

Contribution

In this work, we push past what merely looks good and introduce qualitative and quantitative measures of the information present in clusters, together with indications of how well that information aligns with user intention. We add explainability and interpretability to “offer information that help[s] end users understand the purposes and potential impact of an AI system” (NIST AI RMF Playbook). This research is an effort toward increasing accountability for the information retrieved from LLMs in response to queries and toward giving the user actionable insight into what is happening inside the black-box model. Our aim is to promote transparency in LLM responses and thereby return informed decision-making power to the user, in support of responsible AI principles in LLM adoption for research purposes.