The search for and design of new materials play a crucial role in the development of energy storage devices. Machine learning has shown strong predictive ability across a wide range of tasks, and combining it with materials research can accelerate material development. Herein, we develop the ESM Cloud Toolkit for energy storage materials based on the MatElab platform, which is designed as a convenient and accurate way to automatically record and save the raw data of scientific research. The ESM Cloud Toolkit includes multiple features such as automatic archiving of computational simulation data, post-processing of experimental data, and machine learning applications. It makes the entire research workflow more automated and lowers the entry barrier for applying machine learning in the domain of energy storage materials. It integrates data archiving, traceability, processing, and reuse, and allows individual research data to play a greater role in the era of AI.
Alfalfa (Medicago sativa L.) is a globally significant autotetraploid legume forage crop. However, despite its importance, establishing efficient gene editing systems for cultivated alfalfa remains a formidable challenge. In this study, we pioneered the development of a highly effective ultrasonic-assisted leaf disc transformation system for Gongnong 1 alfalfa, a variety widely cultivated in Northeast China. Subsequently, we created a single transcript CRISPR/Cas9 (CRISPR_2.0) toolkit, incorporating multiplex gRNAs, designed for gene editing in Gongnong 1. Both Cas9 and gRNA scaffolds were under the control of the Arabidopsis ubiquitin-10 promoter, a widely employed polymerase II constitutive promoter known for strong transgene expression in dicots. To assess the toolkit’s efficiency, we targeted PALM1, a gene associated with a recognizable multifoliate phenotype. Utilizing the CRISPR_2.0 toolkit, we directed PALM1 editing at two sites in the wild-type Gongnong 1. Results indicated a 35.1% occurrence of editing events, all in target 2 alleles, while no mutations were detected at target 1 in the transgenic-positive lines. To explore more efficient sgRNAs, we developed a rapid, reliable screening system based on Agrobacterium rhizogenes-mediated hairy root transformation, incorporating the visible reporter MtLAP1. This screening system demonstrated that most purple visible hairy roots underwent gene editing. Notably, sgRNA3, with an 83.0% editing efficiency, was selected using the visible hairy root system. As anticipated, tetra-allelic homozygous palm1 mutations exhibited a clear multifoliate phenotype. These palm1 lines demonstrated an average crude protein yield increase of 21.5% compared to trifoliolate alfalfa. Our findings highlight the modified CRISPR_2.0 system as a highly efficient and robust gene editing tool for autotetraploid alfalfa.
The release of AlphaFold2 has sparked a rapid expansion in protein model databases. Efficient protein structure retrieval is crucial for the analysis of structure models, while measuring the similarity between structures is the key challenge in structural retrieval. Although existing structure alignment algorithms can address this challenge, they are often time-consuming. Currently, the state-of-the-art approach involves converting protein structures into three-dimensional (3D) Zernike descriptors and assessing similarity using Euclidean distance. However, the methods for computing 3D Zernike descriptors mainly rely on structural surfaces and are predominantly web-based, thus limiting their application in studying custom datasets. To overcome this limitation, we developed FP-Zernike, a user-friendly toolkit for computing different types of Zernike descriptors based on feature points. Users simply need to enter a single line of command to calculate the Zernike descriptors of all structures in customized datasets. FP-Zernike outperforms the leading method in terms of retrieval accuracy and binary classification accuracy across diverse benchmark datasets. In addition, we showed the application of FP-Zernike in the construction of the descriptor database and the protocol used for the Protein Data Bank (PDB) dataset to facilitate the local deployment of this tool for interested readers. Our demonstration contained 590,685 structures, and at this scale, our system required only 4-9 s to complete a retrieval. The experiments confirmed that it achieved the state-of-the-art accuracy level. FP-Zernike is an open-source toolkit, with the source code and related data accessible at https://ngdc.cncb.ac.cn/biocode/tools/BT007365/releases/0.1, as well as through a webserver at http://www.structbioinfo.cn/.
Junhai Qi, Chenjie Feng, Yulin Shi, Jianyi Yang, Fa Zhang, Guojun Li, Renmin Han
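The retrieval step described in the FP-Zernike abstract above, comparing fixed-length descriptors by Euclidean distance, can be illustrated with a minimal sketch. This is not the FP-Zernike implementation or its command-line interface: the function name, the structure identifiers, and the three-component toy descriptors are hypothetical and chosen only for illustration, and the computation of the Zernike descriptors themselves is assumed to have been done elsewhere.

```python
import math
from typing import Dict, List, Sequence, Tuple


def rank_by_descriptor_distance(
    query: Sequence[float],
    database: Dict[str, Sequence[float]],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the top_k database entries closest to the query descriptor.

    Hypothetical helper: descriptors are plain numeric vectors of equal
    length, and similarity is measured by Euclidean distance (smaller
    distance means more similar).
    """
    scored = []
    for structure_id, descriptor in database.items():
        if len(descriptor) != len(query):
            raise ValueError(f"descriptor length mismatch for {structure_id}")
        scored.append((structure_id, math.dist(query, descriptor)))
    # Sort ascending so the most similar structures come first.
    scored.sort(key=lambda item: item[1])
    return scored[:top_k]


# Toy descriptors (made up for illustration, not real Zernike values).
toy_database = {
    "struct_A": [0.0, 0.0, 0.0],
    "struct_B": [1.0, 0.0, 0.0],
    "struct_C": [3.0, 4.0, 0.0],
}
hits = rank_by_descriptor_distance([0.0, 0.0, 0.0], toy_database, top_k=2)
print(hits)
```

Because each structure is reduced to a fixed-length vector, a query needs only one distance computation per database entry rather than a full structure alignment, which is the property that makes descriptor-based retrieval fast on large collections.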
Small RNAs (sRNAs), found extensively in plants, play an essential role in plant growth and development. Although various sRNA analysis tools have been developed for plants, most of them depend on programming and command-line environments, which is a challenge for many wet-lab biologists. Furthermore, current sRNA analysis tools mostly focus on certain types of sRNAs and are resource-intensive, normally demanding an immense amount of time and effort to learn numerous tools or scripts and assemble them into a workable pipeline to get the final results. Here, we present sRNAminer, a powerful stand-alone toolkit with a user-friendly interface that integrates all common functions for the analysis of three major types of plant sRNAs: microRNAs (miRNAs), phased small interfering RNAs (phasiRNAs), and heterochromatic siRNAs (hc-siRNAs). We constructed a curated or "golden" set of MIRNA and PHAS loci, which was used to assess the performance of sRNAminer in comparison to other existing tools. The results showed that sRNAminer outperformed these tools in multiple aspects, highlighting its functionality. In addition, to enable an efficient evaluation of sRNA annotation results, we developed Integrative Genomics Viewer (IGV)-sRNA, a modified genome browser optimized from IGV, and incorporated it as a functional module in sRNAminer. IGV-sRNA can display a wealth of sRNA-specific features, enabling a more comprehensive understanding of sRNA data. sRNAminer and IGV-sRNA are both platform-independent software that can be run under all operating systems. They are now freely available at https://github.com/kli28/sRNAminer and https://gitee.com/CJchen/IGV-sRNA.
With the growing availability of data within various scientific domains, generative models hold enormous potential to accelerate scientific discovery. They harness powerful representations learned from datasets to speed up the formulation of novel hypotheses with the potential to impact material discovery broadly. We present the Generative Toolkit for Scientific Discovery (GT4SD). This extensible open-source library enables scientists, developers, and researchers to train and use state-of-the-art generative models to accelerate scientific discovery focused on organic material design.