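To address adaptive scheduling in knowledgeable manufacturing environments, this paper proposes an adaptive scheduling strategy built on state-action uncertainty bias based Q-learning (SAUBQ-learning). To counter the slow convergence and long training time of conventional Q-learning, the strategy uses information entropy to define a measure of state uncertainty and, from that measure, defines an action bias information function for Q-learning. The Q-learning reward function is designed as a heuristic reward function, so that the action bias information enters the learning system as an additional reward; the convergence of the algorithm and the invariance of the optimal policy are proved. During learning, Q-learning adjusts its search space according to the bias information, which reduces the number of effective state-action pairs that must be explored, while the bias information is itself continually adjusted according to the Q-learning results, which avoids incorrect guidance. Simulation comparisons show that the strategy adapts to dynamic environments, converges quickly in large state spaces, and improves scheduling efficiency.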
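The idea of an entropy-based uncertainty measure feeding an additional reward can be illustrated with a minimal Python sketch. It is not the SAUBQ algorithm itself, because the abstract does not give the paper's definitions. The sketch assumes that state uncertainty is the normalized Shannon entropy of a softmax over Q(s, ·), that the bias is expressed as a potential `1 - uncertainty`, and that the additional reward takes the potential-based form `gamma * phi(s') - phi(s)`, which is one well-known way to leave the optimal policy unchanged. The function names, the five-state chain task, and all parameter values are hypothetical.

```python
import math
import random

def softmax(values, temperature=1.0):
    """Numerically stable softmax over a list of floats."""
    top = max(values)
    exps = [math.exp((v - top) / temperature) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def state_uncertainty(q_row, temperature=1.0):
    """Assumed measure: Shannon entropy of softmax(Q(s, .)), normalized to [0, 1]."""
    if len(q_row) < 2:
        return 0.0
    probs = softmax(q_row, temperature)
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    return entropy / math.log(len(q_row))

def potential(q_row):
    """Hypothetical potential: larger when the action choice in a state is settled."""
    return 1.0 - state_uncertainty(q_row)

def shaped_reward(reward, phi_s, phi_next, gamma):
    """Environment reward plus a potential-based additional reward."""
    return reward + gamma * phi_next - phi_s

def train_chain(n_states=5, episodes=300, alpha=0.5, gamma=0.9,
                epsilon=0.2, seed=0):
    """Tabular Q-learning on a toy chain: action 0 moves left, action 1 moves right.

    The agent starts in state 0 and receives reward 1 on reaching the last state.
    """
    rng = random.Random(seed)
    goal = n_states - 1
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        s = 0
        for _ in range(100):  # step cap per episode
            if rng.random() < epsilon:
                a = rng.randrange(2)
            else:
                best = max(q[s])
                # break ties at random so early exploration is unbiased
                a = rng.choice([i for i, v in enumerate(q[s]) if v == best])
            s_next = max(0, s - 1) if a == 0 else s + 1
            done = s_next == goal
            reward = 1.0 if done else 0.0
            # the terminal state is given zero potential
            phi_next = 0.0 if done else potential(q[s_next])
            r = shaped_reward(reward, potential(q[s]), phi_next, gamma)
            target = r if done else r + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
            s = s_next
            if done:
                break
    return q
```

Because the potential is recomputed from the current Q-table at every step, the bias changes as learning proceeds, which loosely mirrors the abstract's statement that the bias information is adjusted according to the Q-learning results.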
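To study the optimal production mix of facility (protected-cultivation) crops under risk, the target-MOTAD model (MOTAD: minimization of total absolute deviations) is improved by removing the restriction that risk states have equal, constant probabilities. Using sample survey data from facility farmers in Luhe District, Nanjing, the improved target-MOTAD model is applied empirically to two scenarios in facility agriculture, "high facility level, high risk, high return" and "low facility level, low risk, low return", to examine farmers' production and operating conditions and the optimal crop mix under different risk conditions. The results show that the improved target-MOTAD model, which uses pseudo-random numbers to simulate the probabilities of risk states, is correct and effective for studying facility crop mix plans under uncertainty, and that the planting structure of "typical facility farmers" in Luhe District, Nanjing needs adjustment. Because it accounts for the pattern of risk states and for inputs of production materials, the improved target-MOTAD model is closer to actual planting conditions and can inform planting plan decisions for facility crops under uncertainty.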
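For orientation, the target-MOTAD linear program is usually written as below. The symbols are assumptions introduced here, not the paper's notation: $x_j$ is the level of crop activity $j$, $c_{rj}$ its return in risk state $r$, $p_r$ the probability of state $r$, $a_{kj}$ and $b_k$ the resource coefficients and limits, $T$ the target income, $y_r$ the shortfall below $T$ in state $r$, and $\lambda$ the allowed expected shortfall, which is varied parametrically to trace the risk-return frontier.

```latex
\begin{align}
\max_{x,\,y}\quad & E(z) = \sum_{j=1}^{n} \bar{c}_j x_j,
  \qquad \bar{c}_j = \sum_{r=1}^{s} p_r c_{rj} \\
\text{s.t.}\quad & \sum_{j=1}^{n} a_{kj} x_j \le b_k, && k = 1,\dots,m \\
& T - \sum_{j=1}^{n} c_{rj} x_j - y_r \le 0, && r = 1,\dots,s \\
& \sum_{r=1}^{s} p_r y_r = \lambda, \\
& x_j \ge 0,\; y_r \ge 0 .
\end{align}
```

With equal state probabilities, $p_r = 1/s$ for every $r$. The improvement described in the abstract relaxes this equality, with the $p_r$ obtained from pseudo-random simulation; how the paper generates and normalizes those probabilities is not stated in the abstract.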
Chinese vegetable production cooperatives supply their members, mostly smallholder farmers, with a rotation schedule for the year. Because vegetable prices fluctuate throughout the year, a rotation schedule that maximizes expected profits, distributes farmers' profits more equitably, maintains the diversity of produce in the market, and reduces the risk of pests and diseases must be adaptive and price-contingent (here called "self-adaptive adjustment"). This study uses an agent-based simulation (ABS) to design self-adaptive rotation schedules that deliver these aims. Under price volatility, the self-adaptive adjustment strategy was both more profitable and more equitable for farmers. This work gives managers of Chinese vegetable production cooperatives a decision-support tool for providing farmers with more profitable and equitable rotation schedules.