How to resolve "Column has values of a type that is not the same as the earlier observed type"

Time: 2019-01-30 10:31:42

Tags: c# ml.net

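I am building a machine learning model. I want to read a set of values from a text file and process them with a CustomMapping. When the CustomMapping runs, the program throws a System.InvalidOperationException.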

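I have narrowed the cause down to the CustomMapping function. The text file being read contains no null values. I have double-checked every variable declaration and made sure each one uses the correct type. My hunch is that the custom mapping interprets the 1s and 0s as booleans rather than floats, although I see no reason why it would need to.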

Apologies for the length, but since the question is about a type problem, I think it is important to show everything.

My pipeline:

var pipeline = context.Transforms.CustomMapping<ProfileInput, ProfileProcess>(ProfileMapping.Transform, nameof(ProfileMapping))
.Append(context.Transforms.Concatenate("Features", "isBanned", "profileVisibility", "profileConfigured", "lastLogOff", "commentPermission", "timeCreated", "friendCount", "gameBannedFriendsCount", "vacBannedFriendsCount", "gameBannedFriendsPercent", "vacBannedFriendsPercent"));

My CustomMapping:

public static void Transform(ProfileInput input, ProfileProcess output)
{
  if (input.numberGameBans > 0 || input.numberVacBans > 0)
    output.isBanned = false;

  output.gameBannedFriendsPercent = input.gameBannedFriendsCount / input.friendCount;
  output.vacBannedFriendsPercent = input.vacBannedFriendsCount / input.friendCount;
  output.profileVisibility = input.profileVisibility;
  output.profileConfigured = input.profileConfigured;
  output.lastLogOff = input.lastLogOff;
  output.commentPermission = input.commentPermission;
  output.timeCreated = input.timeCreated;
  output.friendCount = input.friendCount;
  output.gameBannedFriendsCount = input.gameBannedFriendsCount;
  output.vacBannedFriendsCount = input.vacBannedFriendsCount;
}

ProfileInput:

public class ProfileInput
{
  [LoadColumn(0)]
  public bool commentPermission;
  [LoadColumn(1)]
  public float lastLogOff;
  [LoadColumn(2)]
  public bool profileConfigured;
  [LoadColumn(3)]
  public float profileVisibility;
  [LoadColumn(4)]
  public float timeCreated;
  [LoadColumn(5)]
  public float numberVacBans;
  [LoadColumn(6)]
  public float numberGameBans;
  [LoadColumn(7)]
  public float vacBannedFriendsCount;
  [LoadColumn(8)]
  public float gameBannedFriendsCount;
  [LoadColumn(9)]
  public float friendCount;
}

ProfileProcess:

public class ProfileProcess
{
  public bool isBanned;
  public float profileVisibility;
  public bool profileConfigured;
  public float lastLogOff;
  public bool commentPermission;
  public float timeCreated;
  public float friendCount;
  public float gameBannedFriendsCount;
  public float vacBannedFriendsCount;
  public float gameBannedFriendsPercent;
  public float vacBannedFriendsPercent;
}

Calling pipeline.Fit() throws the following exception:

  

System.InvalidOperationException: 'Column 'profileVisibility' has values of R4, which is not the same as earlier observed type of Bool.'

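I expected the code to complete without throwing, with a TransformerChain model as the output. I know the pipeline has no trainer yet, so the model would be useless for now.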

2 Answers:

Answer 0: (score: 1)

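context.Transforms.Concatenate concatenates columns of the same type. That type is defined by the first input column, which in your case is "isBanned". Since that column is a bool, Concatenate expects every following column to be a bool as well, and it fails on "profileVisibility", which is a float (R4).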
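
One way to give every column the same type inside the pipeline is to convert the bool columns to float before concatenating. The following is only a sketch, assuming ML.NET 1.x, where `Conversion.ConvertType` and `InputOutputColumnPair` are available; the question was written against an earlier preview release whose API may differ. `FeaturePipeline` and `Build` are hypothetical names introduced for illustration, and the column names are taken from the question.

```csharp
using Microsoft.ML;
using Microsoft.ML.Data;

// Hypothetical helper, not part of the question's code.
public static class FeaturePipeline
{
    // Returns the steps that would follow the question's CustomMapping step:
    // 1) convert the three bool columns to Single (float), in place;
    // 2) concatenate the now uniformly typed columns into "Features".
    public static IEstimator<ITransformer> Build(MLContext context)
    {
        return context.Transforms.Conversion.ConvertType(new[]
            {
                // One-argument form: output column reuses the input name.
                new InputOutputColumnPair("isBanned"),
                new InputOutputColumnPair("profileConfigured"),
                new InputOutputColumnPair("commentPermission")
            }, DataKind.Single)
            .Append(context.Transforms.Concatenate("Features",
                "isBanned", "profileVisibility", "profileConfigured",
                "lastLogOff", "commentPermission", "timeCreated",
                "friendCount", "gameBannedFriendsCount",
                "vacBannedFriendsCount", "gameBannedFriendsPercent",
                "vacBannedFriendsPercent"));
    }
}
```

Under these assumptions, the result of `FeaturePipeline.Build(context)` would be appended to the CustomMapping estimator in place of the original Concatenate call.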

If you want to concatenate the columns without any additional preprocessing, you can load them directly as floats (0/1) instead of bools.
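
A minimal sketch of that suggestion, assuming the text file stores the flags as 0 or 1: the bool fields in both classes become float, so every column passed to Concatenate shares one type. Class and field names come from the question. The `isBanned` assignment is an assumption about the intended meaning (1 when the profile has any ban), since the question's mapping only ever assigns false.

```csharp
using Microsoft.ML.Data;

// Input row: every column, including the former bool flags, is loaded as float.
public class ProfileInput
{
    [LoadColumn(0)] public float commentPermission;   // was bool
    [LoadColumn(1)] public float lastLogOff;
    [LoadColumn(2)] public float profileConfigured;   // was bool
    [LoadColumn(3)] public float profileVisibility;
    [LoadColumn(4)] public float timeCreated;
    [LoadColumn(5)] public float numberVacBans;
    [LoadColumn(6)] public float numberGameBans;
    [LoadColumn(7)] public float vacBannedFriendsCount;
    [LoadColumn(8)] public float gameBannedFriendsCount;
    [LoadColumn(9)] public float friendCount;
}

// Output row: all fields are float so Concatenate sees a single type.
public class ProfileProcess
{
    public float isBanned;            // was bool
    public float profileVisibility;
    public float profileConfigured;   // was bool
    public float lastLogOff;
    public float commentPermission;   // was bool
    public float timeCreated;
    public float friendCount;
    public float gameBannedFriendsCount;
    public float vacBannedFriendsCount;
    public float gameBannedFriendsPercent;
    public float vacBannedFriendsPercent;
}

public static class ProfileMapping
{
    public static void Transform(ProfileInput input, ProfileProcess output)
    {
        // Assumption: 1 means the profile has at least one ban, 0 means none.
        output.isBanned =
            (input.numberGameBans > 0 || input.numberVacBans > 0) ? 1f : 0f;

        output.gameBannedFriendsPercent = input.gameBannedFriendsCount / input.friendCount;
        output.vacBannedFriendsPercent = input.vacBannedFriendsCount / input.friendCount;
        output.profileVisibility = input.profileVisibility;
        output.profileConfigured = input.profileConfigured;
        output.lastLogOff = input.lastLogOff;
        output.commentPermission = input.commentPermission;
        output.timeCreated = input.timeCreated;
        output.friendCount = input.friendCount;
        output.gameBannedFriendsCount = input.gameBannedFriendsCount;
        output.vacBannedFriendsCount = input.vacBannedFriendsCount;
    }
}
```

With these definitions the pipeline from the question can stay unchanged, because every column named in the Concatenate call is now a float.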

Answer 1: (score: 1)

All you need to do is one-hot encode your non-float columns:

.Append(context.Transforms.Categorical.OneHotEncoding(outputColumnName: "isBannedEncoded", inputColumnName: "isBanned"))
.Append(context.Transforms.Categorical.OneHotEncoding(outputColumnName: "profileConfiguredEncoded", inputColumnName: "profileConfigured"))
.Append(context.Transforms.Categorical.OneHotEncoding(outputColumnName: "commentPermissionEncoded", inputColumnName: "commentPermission"))

.Append(context.Transforms.Concatenate("Features", "isBannedEncoded", "profileVisibility", "profileConfiguredEncoded", "lastLogOff", "commentPermissionEncoded", "timeCreated", "friendCount", "gameBannedFriendsCount", "vacBannedFriendsCount", "gameBannedFriendsPercent", "vacBannedFriendsPercent"));

Hope that helps.