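iFLYTEK Spark Scilit-X1-13B is built on the latest generation of the iFLYTEK foundation model and trained on several core tasks derived from scientific literature. As a large language model designed for academic research scenarios, it performs strongly in paper-assisted reading, academic translation, English polishing, and review generation, and aims to give researchers, teachers, and students efficient and accurate assistance.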
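Core Features
| Parameter | Value |
|---|---|
| Total parameters | 13B |
| Context length | 32K |
| Window length | 32K |
| Number of layers | 40 |
| Attention hidden dimension | 5120 |
| Number of attention heads | 40 |
| Vocabulary size | 130K |
| Attention mechanism | GQA |
| Activation function | GeLU |
Model specifications of the iFLYTEK Spark Scilit-X1-13B architecture.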
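
As a quick sanity check on the specification table, the per-head dimension can be derived from the attention hidden dimension and the number of heads. The sketch below assumes the hidden dimension is split evenly across heads, which is the common convention but is not stated in the specification; the variable names are illustrative only.

```python
# Values copied from the specification table.
hidden_dim = 5120   # attention hidden dimension
num_heads = 40      # number of attention heads

# Assumption: the hidden dimension divides evenly across the heads.
if hidden_dim % num_heads != 0:
    raise ValueError("hidden_dim is not divisible by num_heads")

head_dim = hidden_dim // num_heads
print(f"per-head dimension: {head_dim}")
```

The table does not list the number of key/value heads used by GQA, so that quantity is left out here.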
| Task | Metric | Spark-Scilit-X1-13B | Qwen3-32B | Qwen3-Next-80B-A3B | DeepSeek-R1 | O3 |
|---|---|---|---|---|---|---|
| Paper-assisted reading | MOS | 4.04 | 3.98 | 4.06 | 4.01 | 4.1 |
| Academic translation | MOS | 4.18 | 4.04 | 4.08 | 4.12 | 4.22 |
| English polishing | MOS | 4.22 | 4.11 | 4.2 | 3.98 | 4.28 |
| Review generation | MOS | 3.88 | 3.5 | 3.7 | 3.68 | 4.01 |
Evaluation note: all metrics are mean opinion scores (MOS) from human evaluation on a 1–5 scale.
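
To compare the models across all four tasks at a glance, the scores in the table can be averaged per model. The snippet below is a minimal sketch: the unweighted mean is an illustrative aggregate chosen here, not a figure reported with the evaluation, and the dictionary layout is an assumption.

```python
from statistics import mean

# MOS values copied from the table, in task order:
# paper-assisted reading, academic translation, English polishing, review generation.
scores = {
    "Spark-Scilit-X1-13B": [4.04, 4.18, 4.22, 3.88],
    "Qwen3-32B": [3.98, 4.04, 4.11, 3.5],
    "Qwen3-Next-80B-A3B": [4.06, 4.08, 4.2, 3.7],
    "DeepSeek-R1": [4.01, 4.12, 3.98, 3.68],
    "O3": [4.1, 4.22, 4.28, 4.01],
}

# Unweighted mean over the four tasks (an illustrative aggregate only).
averages = {name: mean(values) for name, values in scores.items()}

# Sort models from highest to lowest average MOS.
ranking = sorted(averages, key=averages.get, reverse=True)
for name in ranking:
    print(f"{name}: {averages[name]:.2f}")
```

Because each task is scored independently by human raters, a single average hides per-task differences, so the table remains the primary reference.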
Requirements
cd /path/to/Spark-Scilit-X1-13B
# We recommend using Python 3.10
pip install -r requirements.txt
pip install .

import torch
from transformers import AutoModelForCausalLM
from tokenizer_spark import SparkTokenizer
# Load model and tokenizer
model_name = "iflytek/Spark-Scilit-X1-13B"
tokenizer = SparkTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)
# Reactive mode: the system prompt does not request a reasoning trace
chat_history = [
    {
        "role": "system",
        # System prompt (Chinese): asks for comprehensive, professional, focused answers.
        "content": "你能够回答用户的各种问题,回答问题能够角度全面、表述专业、重点突出。"
    },
    {
        "role": "user",
        # User prompt (Chinese): asks the model to translate the paper excerpt into Chinese.
        "content": "你将进行论文片段的翻译任务,请将给定论文片段翻译成中文。论文片段如下:Super-hydrophobic delivery (SHD) is an efficient approach to enrich trace analytes into hot spot regions for ultrasensitive surface-enhanced Raman scattering (SERS) detection. In this article, we propose an efficient and simple method to prepare a highly uniform SHD-SERS platform of high performance in trace detection, named as “silver-nanoparticle-grafted silicon nanocones” (termed AgNPs/SiNC) platform. It is fabricated via droplet-confined electroless deposition on the super-hydrophobic SiNC array. The AgNPs/SiNC platform allows trace analytes enriched into hot spots formed by AgNPs, leading to excellent reproducibility and sensitivity. The relative standard deviation (RSD) for detecting R6G (10⁻⁶ M) is down to 4.70%, and the lowest detection concentration for R6G is 10⁻⁸ M. Moreover, various contaminants in complex liquid environments, such as crystal violet (10⁻⁶ M) in lake water, melamine (10⁻⁶ M) in liquid milk, and methyl parathion (10⁻⁶ M) in tap water, can be detected using the SERS platform. This result demonstrates the great potential of the AgNPs/SiNC platform in the fields of food safety and environmental monitoring."
    }
]
inputs = tokenizer.apply_chat_template(
    chat_history,
    tokenize=True,
    return_tensors="pt",
    add_generation_prompt=True
).to(model.device)
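# Assumes apply_chat_template returned a tensor of token IDs (as the
# inputs.shape[1] slice below expects); if it returns a dict-like object,
# pass inputs["input_ids"] here and slice with its shape instead.
outputs = model.generate(
    inputs,
    max_new_tokens=32768,
    top_k=10,
    do_sample=True,
    repetition_penalty=1.1,
    temperature=0.5,
    eos_token_id=5,
    pad_token_id=0
)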
# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(
    outputs[0][inputs.shape[1]:],
    skip_special_tokens=True
)
print(response)
# Deliberative mode: the system prompt switches the model to slow thinking
chat_history = [
    {
        "role": "system",
        # System prompt (Chinese): enables slow-thinking mode. The model first writes
        # its reasoning, opened by <unused6> and closed by <unused7>, then gives the
        # answer after <unused7>.
        "content": "你能够回答用户的各种问题,回答问题能够角度全面、表述专业、重点突出。当前是慢思考模式,请你先深入剖析给出问题的关键要点与内在逻辑,生成思考过程,再根据思考过程回答给出问题。思考过程以<unused6>开头,在结尾处用<unused7>标注结束,<unused7>后为基于思考过程的回答内容。"
    },
    {
        "role": "user",
        # User prompt (Chinese): asks the model to translate the paper excerpt into Chinese.
        "content": "你将进行论文片段的翻译任务,请将给定论文片段翻译成中文。论文片段如下:Super-hydrophobic delivery (SHD) is an efficient approach to enrich trace analytes into hot spot regions for ultrasensitive surface-enhanced Raman scattering (SERS) detection. In this article, we propose an efficient and simple method to prepare a highly uniform SHD-SERS platform of high performance in trace detection, named as “silver-nanoparticle-grafted silicon nanocones” (termed AgNPs/SiNC) platform. It is fabricated via droplet-confined electroless deposition on the super-hydrophobic SiNC array. The AgNPs/SiNC platform allows trace analytes enriched into hot spots formed by AgNPs, leading to excellent reproducibility and sensitivity. The relative standard deviation (RSD) for detecting R6G (10⁻⁶ M) is down to 4.70%, and the lowest detection concentration for R6G is 10⁻⁸ M. Moreover, various contaminants in complex liquid environments, such as crystal violet (10⁻⁶ M) in lake water, melamine (10⁻⁶ M) in liquid milk, and methyl parathion (10⁻⁶ M) in tap water, can be detected using the SERS platform. This result demonstrates the great potential of the AgNPs/SiNC platform in the fields of food safety and environmental monitoring."
    }
]
inputs = tokenizer.apply_chat_template(
    chat_history,
    tokenize=True,
    return_tensors="pt",
    add_generation_prompt=True
).to(model.device)
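# Assumes apply_chat_template returned a tensor of token IDs (as the
# inputs.shape[1] slice below expects); if it returns a dict-like object,
# pass inputs["input_ids"] here and slice with its shape instead.
outputs = model.generate(
    inputs,
    max_new_tokens=32768,
    top_k=10,
    do_sample=True,
    repetition_penalty=1.1,
    temperature=0.5,
    eos_token_id=5,
    pad_token_id=0
)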
# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(
    outputs[0][inputs.shape[1]:],
    skip_special_tokens=True
)
print(response)
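
The deliberative system prompt asks the model to open its reasoning with `<unused6>`, close it with `<unused7>`, and place the final answer after `<unused7>`. The following is a minimal sketch of how such a response could be split into reasoning and answer. `split_reasoning` is a hypothetical helper, not part of the released code, and it assumes the markers appear literally in the decoded text; depending on how the tokenizer treats these tokens, decoding with `skip_special_tokens=False` may be needed to keep them.

```python
def split_reasoning(text, start="<unused6>", end="<unused7>"):
    """Split a deliberative-mode response into (reasoning, answer).

    Hypothetical helper: if the end marker is missing, the whole text is
    returned as the answer and the reasoning is empty.
    """
    head, found_end, tail = text.partition(end)
    if not found_end:
        return "", text.strip()
    # Drop everything up to and including the start marker, if present.
    _, found_start, after_start = head.partition(start)
    reasoning = after_start if found_start else head
    return reasoning.strip(), tail.strip()


# Illustrative input only; real model output will differ.
sample = "<unused6>Identify the key terms first.<unused7>Here is the translation."
reasoning, answer = split_reasoning(sample)
print("reasoning:", reasoning)
print("answer:", answer)
```

Falling back to treating the whole text as the answer keeps the helper usable for reactive-mode responses, which contain no markers.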
iFLYTEK Spark Scilit-X1-13B is licensed under the Apache 2.0 License.