Embedded resources ending up with extra characters
joshevensen
#1 Posted : Wednesday, January 30, 2013 2:57:33 PM(UTC)
Rank: Newbie

Groups: Registered
Joined: 6/26/2012(UTC)
Posts: 6

When running tests that use an embedded resource to store a test JSON file, I'm getting:

Code:
System.Runtime.Serialization.SerializationException: There was an error deserializing the object of type [TYPE]. Encountered unexpected character 'ï'. ---> System.Xml.XmlException: Encountered unexpected character 'ï'.
   at System.Xml.XmlExceptionHelper.ThrowXmlException(XmlDictionaryReader reader, XmlException exception)
   at System.Runtime.Serialization.Json.XmlJsonReader.ReadAttributes()
   at System.Runtime.Serialization.Json.XmlJsonReader.ReadNonExistentElementName


If I run the test in NCrunch's debugger and call GetString with System.Text.Encoding.Default on the binary being deserialized, I see that the file does indeed start with that character:
Code:
{\r\n   \"Patient\":


However, if I run from MSTest, the test passes fine and the binary doesn't have that character at the start.

Anything I'm missing?
Remco
#2 Posted : Wednesday, January 30, 2013 8:42:58 PM(UTC)
Rank: NCrunch Developer

Groups: Administrators
Joined: 4/16/2011(UTC)
Posts: 7,177

Thanks: 968 times
Was thanked: 1298 time(s) in 1203 post(s)
Hi, thanks for posting!

When NCrunch works with files that are open in Visual Studio, it extracts their content from the IDE in Unicode and transfers it to the workspace, writing it back in Unicode with the byte order marks. Where the file uses a different encoding, NCrunch will detect the encoding and use it when writing the file back with the IDE content.

However, I can see how this could cause problems if you were storing content in a resource file that has a specific encoding but does not make use of byte order marks to declare this encoding. It's theoretically possible for such a file to be written to the workspace in a different state to how it was supplied in your original solution.
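To make the symptom concrete, here is a minimal sketch of what a UTF-8 byte order mark looks like on disk and why it surfaces as 'ï'. The class and method names are hypothetical, and ISO-8859-1 is used as a stand-in for whatever single-byte code page Encoding.Default resolves to on the test machine (an assumption; on many Western Windows setups it is Windows-1252, which maps these three bytes the same way).

```csharp
using System;
using System.Text;

// Hypothetical helper for illustration; not part of NCrunch or the original test code.
public static class BomDemo
{
    // Encodes text as UTF-8 and prepends the encoding's preamble (EF BB BF),
    // which is what a text editor does when it saves "UTF-8 with signature".
    public static byte[] EncodeUtf8WithBom(string text)
    {
        var encoding = new UTF8Encoding(true); // true = emit the UTF-8 identifier
        byte[] preamble = encoding.GetPreamble();
        byte[] body = encoding.GetBytes(text);

        var result = new byte[preamble.Length + body.Length];
        Buffer.BlockCopy(preamble, 0, result, 0, preamble.Length);
        Buffer.BlockCopy(body, 0, result, preamble.Length, body.Length);
        return result;
    }

    // Decodes bytes one-to-one as ISO-8859-1, so each BOM byte becomes a visible character.
    public static string DecodeAsLatin1(byte[] data)
    {
        return Encoding.GetEncoding("iso-8859-1").GetString(data);
    }
}
```

Calling `BitConverter.ToString(BomDemo.EncodeUtf8WithBom("{}"))` gives `EF-BB-BF-7B-7D`, and the first character of `BomDemo.DecodeAsLatin1(...)` on those bytes is 'ï' (U+00EF), the same character reported in the exception.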

Although the behaviour is technically in error, there isn't really anything that can be done on the NCrunch side to fix this. NCrunch has no firm knowledge of how the file is encoded, so when it goes to reproduce the file using data supplied only from the IDE, it can only guess how the file should be structured on disk.

As you're working with the file in binary, I think this raises the key question of whether you consider the file to be a binary file or a text file.

If the file is considered binary, then it shouldn't be opened in the IDE while using NCrunch, as the text editor will cause the data stored in the file to be sourced from the text extracted from the IDE. I would perhaps suggest using a file extension that identifies a binary file, so that it is clear that the file is being interpreted by code that expects a strict binary form.

If the file is considered text, I suggest making an allowance for the byte order marks in your code. You may be able to use a different method of converting from binary to string that takes the byte order marks into consideration. This will then also make your code more resilient to sudden changes in the resource file caused by text editors that may presume to add these marks where your code otherwise wouldn't expect them.
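As a rough sketch of that allowance, assuming the resource is UTF-8: either strip the three-byte preamble from the raw bytes before handing them to the deserializer, or convert to a string through a StreamReader, which skips a leading byte order mark. The BomTolerantReader class and its method names are hypothetical; only the framework calls are real.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper for illustration; assumes the resource content is UTF-8.
public static class BomTolerantReader
{
    private static readonly byte[] Utf8Bom = { 0xEF, 0xBB, 0xBF };

    // True when the buffer begins with the UTF-8 byte order mark.
    public static bool StartsWithUtf8Bom(byte[] data)
    {
        if (data == null || data.Length < Utf8Bom.Length) return false;
        for (int i = 0; i < Utf8Bom.Length; i++)
        {
            if (data[i] != Utf8Bom[i]) return false;
        }
        return true;
    }

    // Returns the bytes without a leading BOM, for code that deserializes from raw bytes.
    public static byte[] StripUtf8Bom(byte[] data)
    {
        if (!StartsWithUtf8Bom(data)) return data;
        var result = new byte[data.Length - Utf8Bom.Length];
        Buffer.BlockCopy(data, Utf8Bom.Length, result, 0, result.Length);
        return result;
    }

    // Converts bytes to text; StreamReader consumes a leading BOM instead of
    // returning it as a character (Encoding.UTF8.GetString would keep it as U+FEFF).
    public static string ReadText(byte[] data)
    {
        using (var stream = new MemoryStream(data))
        using (var reader = new StreamReader(stream, Encoding.UTF8, true))
        {
            return reader.ReadToEnd();
        }
    }
}
```

With either approach the test input is the same whether or not an editor added the marks, so the result no longer depends on which tool last wrote the file.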


I hope this makes sense!


Cheers,

Remco
joshevensen
#3 Posted : Thursday, January 31, 2013 3:03:06 PM(UTC)

Remco;3616 wrote:
Hi, thanks for posting!
If the file is considered binary, then it shouldn't be opened in the IDE while using NCrunch, as the text editor will cause the data stored in the file to be sourced from the text extracted from the IDE. I would perhaps suggest using a file extension that identifies a binary file, so that it is clear that the file is being interpreted by code that expects a strict binary form.


Does NCrunch look at file extensions for embedded resources?
joshevensen
#4 Posted : Thursday, January 31, 2013 3:04:35 PM(UTC)

Saving the file from an outside editor did the trick, but when I look in a hex editor at the actual file saved after editing in Visual Studio, it doesn't have any encoding characters at the start.
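For anyone wanting to compare what each runner actually sees, a small sketch like the following dumps the leading bytes of a stream as hex, so the check can be made on the embedded resource itself inside the test rather than on the source file. ResourceInspector is a hypothetical name, and the resource name in the usage note is a placeholder.

```csharp
using System;
using System.IO;

// Hypothetical diagnostic helper; not part of NCrunch or MSTest.
public static class ResourceInspector
{
    // Reads up to 'count' bytes from the start of the stream and formats them
    // as hyphen-separated hex, e.g. "EF-BB-BF" for a UTF-8 byte order mark.
    public static string LeadingBytesAsHex(Stream stream, int count)
    {
        var buffer = new byte[count];
        int total = 0;
        while (total < count)
        {
            int read = stream.Read(buffer, total, count - total);
            if (read == 0) break; // end of stream reached early
            total += read;
        }
        return BitConverter.ToString(buffer, 0, total);
    }
}
```

In a test this could be called as `ResourceInspector.LeadingBytesAsHex(typeof(SomeTest).Assembly.GetManifestResourceStream("Placeholder.Resource.json"), 3)`, where both the type and the resource name are made up for illustration. A result of `EF-BB-BF` under one runner but not the other would confirm that the two are embedding different copies of the file.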
Remco
#5 Posted : Thursday, January 31, 2013 8:23:54 PM(UTC)
NCrunch doesn't care about the file extensions of items in your solution, but other tools likely will, and people will often form associations on what can and cannot be edited by a text editor (e.g. most people will happily edit a .xml file in Notepad, though will avoid doing so with a .dat file). Making sure the file is represented according to the way it is processed will help ensure the format stays consistent.

Because NCrunch relies on the .NET framework to detect the encoding of files, it's quite possible that it will work fine without the byte order marks provided the format of the file is consistent. Based on the number of variables, I'm not sure if I can accurately speculate on everything that has just happened on your side by re-saving this file, although I'm glad that you managed to find a solution :) It may be worth also checking that editing the file in VS now doesn't cause the problem to resurface.