[windev] Problem reading text files

truckleaj-windev at yahoo.co.uk
Thu Apr 2 08:11:08 GMT 2009


The other problem I have is that CHtmlView doesn't seem to like the UTF-8
encoding very much.

If I save the file as UTF-8, including the declaration line at the top, and
right-click the view, the encoding is set to UTF-8 but the Italian accents
show as boxes.

The only way I have got the various languages to work is by using a charset
of windows-nnnn accordingly. Then the HTMLView renders the accents.

But it was my understanding that UTF-8 was what we should be using these
days. The same file renders fine in IE7, but not in the web-browser control.
I tried to google for CHtmlView / encoding / utf-8 and couldn't see anything
about this.

So I have stayed with the relevant windows-nnnn values based upon the
language of my program UI.

Andrew




________________________________
From: Serge Wautier <serge at wautier.net>
To: WinDev <windev at windev.org>
Sent: Tuesday, 31 March, 2009 16:46:25
Subject: Re: [windev] Problem reading text files

"There ain't no such thing as plain text".

CStdioFile::ReadString() (I guess that's what you use. You don't tell us.
You don't show us any code) is based on fgets() which in turn is
character-oriented, which means it expects text in one given codepage.
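
When the codepage is unknown, one way to learn what a file announces about
itself is to look at its first bytes for a byte-order mark before handing
it to any string API. The following is a minimal sketch in standard C++
rather than MFC; the names Bom and DetectBom are hypothetical and not part
of any library. A file without a BOM may still be UTF-8 or any codepage, so
Bom::None only means "nothing announced".

```cpp
#include <cstddef>
#include <string>

// Encodings that can be recognised from a leading byte-order mark (BOM).
enum class Bom { None, Utf8, Utf16LE, Utf16BE };

// Inspect the first bytes of a raw buffer and report which BOM it starts with.
// bomLength receives the number of bytes to skip before the actual text.
// Note: a UTF-32 LE file (FF FE 00 00) would be reported as Utf16LE here.
Bom DetectBom(const std::string& bytes, std::size_t& bomLength)
{
    auto at = [&](std::size_t i) {
        return static_cast<unsigned char>(bytes[i]);
    };

    if (bytes.size() >= 3 && at(0) == 0xEF && at(1) == 0xBB && at(2) == 0xBF) {
        bomLength = 3;
        return Bom::Utf8;
    }
    if (bytes.size() >= 2 && at(0) == 0xFF && at(1) == 0xFE) {
        bomLength = 2;
        return Bom::Utf16LE;
    }
    if (bytes.size() >= 2 && at(0) == 0xFE && at(1) == 0xFF) {
        bomLength = 2;
        return Bom::Utf16BE;
    }
    bomLength = 0;
    return Bom::None;
}
```

Skipping bomLength bytes before processing would keep the mark from ending
up as gibberish in the middle of a merged file.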

If you don't know the codepage and have to preserve encoding and all you
want is to look for specific HTML tags and modify them, your only chance is
to use byte buffers instead of strings.
Use CFile::Read() to fill a binary buffer and cast bytes to chars: The
substrings you're looking for are 7-bit ASCII. You'll find them easily and
replacing them won't be difficult. A bit more coding for a lot fewer
problems.
Anyway, forget about strings: Since you don't know how they are encoded,
you'll never know if you broke them.
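
A minimal sketch of the byte-buffer approach, written in standard C++ so it
stands alone (std::ifstream in binary mode stands in for CFile::Read()).
The helper names ReadAllBytes and ReplaceAsciiToken are hypothetical. It
assumes an ASCII-compatible encoding such as UTF-8 or a single-byte Windows
codepage; a UTF-16 file stores each ASCII character as two bytes and would
need different handling.

```cpp
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>

// Read a whole file as raw bytes; no codepage conversion takes place.
// Returns an empty string if the file cannot be opened.
std::string ReadAllBytes(const std::string& path)
{
    std::ifstream in(path, std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}

// Replace every occurrence of a 7-bit ASCII token in a byte buffer.
// Bytes outside the matched tokens are copied through untouched, so
// accented characters keep whatever encoding they already had.
std::string ReplaceAsciiToken(std::string bytes,
                              const std::string& from,
                              const std::string& to)
{
    if (from.empty())
        return bytes;

    std::size_t pos = 0;
    while ((pos = bytes.find(from, pos)) != std::string::npos) {
        bytes.replace(pos, from.size(), to);
        pos += to.size();  // continue after the inserted text
    }
    return bytes;
}
```

The result can be written back with a binary-mode output stream, so the
bytes that were never inspected reach the output file unchanged.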


Serge.
http://www.apptranslator.com



> -----Original Message-----
> From: windev-bounces at windev.org [mailto:windev-bounces at windev.org] On
> Behalf Of truckleaj-windev at yahoo.co.uk
> Sent: samedi 28 mars 2009 16:06
> To: windev at windev.org
> Subject: [windev] Problem reading text files
>
> Hello
>
> I am using CStdioFile to read a text file, line by line, doing some
> processing on each line, and then adding the resulting CString into a
> new CStdioFile.
>
> In all respects it is very basic coding. The problem lies in the fact
> that if the file is Unicode and not ANSI, CStdioFile reads in the
> byte-order mark. This ends up as gibberish in my resulting file,
> which is an HTML file.
>
> The file being merged into the HTML is essentially CSS content
> information. However it is possible that the user might put some
> language-specific detail in there in one or two of the JavaScript
> methods. So they *might* end up changing the file to Unicode. In which
> case I'll get the same problem again.
>
> What can I do? Ideally I want to be able to use CStdioFile with any of
> these files and read in all the lines as CString objects. My application
> is Unicode so it should be able to deal with it.
>
> Please advise. Thanks.
>
> Andrew
> --
> Windev mailing list at Windev at windev.org
>
> Lost your password? Need to unsubscribe or change your delivery
> options?
> Go to http://lists.windev.org/mailman/listinfo/windev
> --
> Search the Windev Archives - www.windev.org
