Data-Driven Function Calling Improvements in Large Language Model for Online Financial QA

arXiv cs.CL / 4/8/2026


Key Points

The paper addresses how to improve LLM function calling for online financial question-answering by tailoring API tool use to the financial domain rather than relying on generic function-calling behavior.
It proposes a data-driven pipeline with periodic dataset construction and updates, using user-query-related samples to better match real online query patterns.
The method introduces an augmentation strategy called AugFC to explore possible function parameter values, increasing diversity and mitigating out-of-distribution issues between user queries and required API inputs.
A two-step training approach is used to teach the LLM to correctly invoke financial functions, with experiments on offline datasets and an online deployment scenario showing improved performance.
The pipeline has been adopted in the financial QA of YuanBao, one of the largest chat platforms in China, indicating practical value beyond offline evaluation.
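
To make "correctly invoking a financial function" concrete, here is a minimal sketch of checking a model-emitted call against a tool schema. The tool name `get_stock_price`, its parameters, and the JSON call layout are assumptions made for illustration; the paper's private APIs and call format are not described in this summary.

```python
import json

# Hypothetical tool schema; the function name and parameters are illustrative only.
STOCK_PRICE_TOOL = {
    "name": "get_stock_price",
    "parameters": {
        "ticker": {"type": str, "required": True},
        "date": {"type": str, "required": False},
    },
}


def validate_call(raw_call: str, tool: dict) -> list[str]:
    """Return a list of problems found in a model-emitted JSON function call."""
    try:
        call = json.loads(raw_call)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(call, dict):
        return ["call must be a JSON object"]

    problems = []
    if call.get("name") != tool["name"]:
        problems.append(f"unknown function: {call.get('name')!r}")

    args = call.get("arguments", {})
    if not isinstance(args, dict):
        return problems + ["arguments must be a JSON object"]

    # Check each declared parameter for presence and type.
    for pname, spec in tool["parameters"].items():
        if pname not in args:
            if spec["required"]:
                problems.append(f"missing required parameter: {pname}")
        elif not isinstance(args[pname], spec["type"]):
            problems.append(f"wrong type for parameter: {pname}")

    # Flag arguments the schema does not declare.
    for pname in args:
        if pname not in tool["parameters"]:
            problems.append(f"unexpected parameter: {pname}")
    return problems
```

An empty list means the call matches the schema. A check of this kind could serve as an offline correctness signal, although the evaluation metrics the paper actually uses are not stated here.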

Abstract

Large language models (LLMs) have been incorporated into numerous industrial applications. Meanwhile, a vast array of API assets is scattered across various functions in the financial domain. An online financial question-answering system can leverage both LLMs and private APIs to provide timely financial analysis and information. The key is equipping the LLM with function calling capability tailored to the financial scenario. However, a generic LLM must be given customized financial APIs to call, and it struggles to adapt to the financial domain. Additionally, online user queries are diverse and contain parameters that are out-of-distribution relative to the required function input parameters, which makes it more difficult for a generic LLM to serve online users. In this paper, we propose a data-driven pipeline to enhance function calling in the LLM behind our deployed online financial QA system, comprising dataset construction, data augmentation, and model training. Specifically, we construct a dataset based on a previous study and update it periodically, incorporating user queries and an augmentation method named AugFC. Adding user-query-related samples exploits our financial toolset in a data-driven manner, while AugFC explores the possible parameter values to enhance the diversity of the updated dataset. Then, we train an LLM with a two-step method, which enables it to use our financial functions. Extensive experiments on existing offline datasets, as well as deployment in an online scenario, illustrate the superiority of our pipeline. The pipeline has been adopted in the financial QA of YuanBao (https://yuanbao.tencent.com/chat/), one of the largest chat platforms in China.
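
The abstract does not spell out how AugFC works, so the following is only a rough sketch of the general idea of exploring parameter values: enumerate combinations of candidate values and pair each with a templated query to produce (query, function call) training samples. The candidate values, the templates, and the function name `get_price_trend` are all hypothetical and are not taken from the paper.

```python
import itertools
import json

# Hypothetical candidate values per parameter (not from the paper).
CANDIDATES = {
    "ticker": ["0700.HK", "AAPL", "600519.SS"],
    "period": ["week", "month", "year"],
}

# Hypothetical query templates that mention every parameter.
TEMPLATES = [
    "How has {ticker} performed over the last {period}?",
    "Show me the trend for {ticker} over the past {period}.",
]


def augment(function_name: str, candidates: dict, templates: list[str]) -> list[dict]:
    """Build (query, call) samples covering every combination of candidate values."""
    names = sorted(candidates)  # fixed parameter order for reproducible output
    samples = []
    for values in itertools.product(*(candidates[n] for n in names)):
        args = dict(zip(names, values))
        call = json.dumps({"name": function_name, "arguments": args}, sort_keys=True)
        for template in templates:
            samples.append({"query": template.format(**args), "call": call})
    return samples


samples = augment("get_price_trend", CANDIDATES, TEMPLATES)
```

With three values for each of two parameters and two templates, this yields 18 samples. Exhaustive enumeration grows multiplicatively with the number of parameters, so a practical system would likely sample combinations or prioritise values observed in real user queries.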
