I'm learning RichEdit control programming and writing my own editor. By following Iczelion's win32asm tutorials I can read an ANSI text file into a RichEdit control, but when I read Unicode text the same way I get unreadable characters. I know I have to change the code to make it compatible with Unicode text. Can someone give me a code snippet that demonstrates how to read Unicode text into a RichEdit control?

Thx!

dREAMtHEATER
Posted on 2002-03-29 05:23:36 by dREAMtHEATER
Just a friendly reminder that this board has a very good search feature...

E.g., I found an old post worded almost exactly the same as your title:

How to work in richedit using Unicode?

This has all the APIs you would need to get the conversions done.

Enjoy..
:alright:
NaN
Posted on 2002-03-29 13:12:09 by NaN
Hi NaN:
Before I posted this thread, I searched the whole forum for the Unicode subject and carefully read the post you mentioned, but I still have no idea how to code it. Can you give me some sample code to demonstrate it?
Posted on 2002-03-29 21:48:08 by dREAMtHEATER
Sure, I will try, but first tell me more clearly what you need to happen:

A) Want normal text to be converted to Unicode and then placed into the RichEdit?

B) Want Unicode text to be converted to normal text for the RichEdit?

C) Normal text shown as Unicode in the RichEdit (dunno why?)

D) Something else? :)

I'm not too sure exactly which API you want me to help you out with (essentially, I know one of them will help you :) )


PS: Thanx for using the search!

:alright:
NaN
Posted on 2002-03-30 02:52:05 by NaN
Hi! NaN:
Thanks for your help. If you write some code, I would like it to work the way Notepad does under 2000/XP: when the user selects a text file, he gets a chance to choose which text encoding to use. That means dealing with the GetOpenFileName API and adding an option that lets the user select the encoding, such as ANSI, Unicode or UTF-8. I don't know how to add this option to the OPENFILENAME structure. When the user saves a file, he can do the same thing.
According to the user's selection, you should

A) read the file as normal text when it's in ANSI format
B) show Unicode text as normal text in the RichEdit when it's in Unicode format (this is the core question; I don't know how to do it. Maybe it is your option B, converting Unicode text to normal text for the RichEdit?)

I will wait ur good news.
Sorry for my bad English!

dREAMtHEATER


Posted on 2002-03-30 05:28:55 by dREAMtHEATER
OK, I will be honest and say I have used Unicode strings in the past, but not to the extent of detecting them in a file...

I briefly looked through the OPENFILENAME struct and don't see anything that would give you an indication of whether the chosen file is an ASCII .txt or a Unicode .txt. So I will leave it to you to decide when it's "time" to use the following conversion routine:

To convert a Unicode string to standard ASCII, use WideCharToMultiByte (words -> bytes).

Ernie has made some simple little macros to make this simpler:
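; Ascii2Unicode: ANSI string -> Unicode (UTF-16) string.
; The source length is passed as -1, so the source must be null-terminated.
; The last argument is the size of the RETURN buffer, in 16-bit characters.
Ascii2Unicode MACRO pwszReturnBuf:REQ, pszSourceBuf:REQ, SizeOfSourceBuf:REQ
    invoke MultiByteToWideChar, CP_ACP, 0, pszSourceBuf, -1, pwszReturnBuf, SizeOfSourceBuf
ENDM
;-------------------------------------------------------------------------------
; Unicode2Ascii: Unicode (UTF-16) string -> ANSI string.
; The source length is passed as -1, so the source must be null-terminated.
; The last argument is the size of the RETURN buffer, in bytes.
Unicode2Ascii MACRO pszReturnBuf:REQ, pwszSourceBuf:REQ, SizeOfSourceBuf:REQ
    invoke WideCharToMultiByte, CP_ACP, 0, pwszSourceBuf, -1, pszReturnBuf, SizeOfSourceBuf, NULL, NULL
ENDM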


    So the steps I would take in your case are:
    [*] - Get the size of the file that will be loaded ( nFileSize )
    [*] - Load the file (probably through file mapping, which gets you a pointer pSrcData)
    [*] - Decide whether you need to convert Unicode->bytes (uncertain at this point how)
    [*] - Have some global memory for a buffer to decode into. You can make a swap file in a temp dir, allocate a large piece of global memory, or use RichEdit's buffer (all should work).
    [*] - Once you have a buffer to work with, use "Unicode2Ascii pGlobalData, pSrcData, nFileSize" (return buffer first, then source) for a full translation at once. For partial translations, adjust the source pointer to the beginning of the part you want, and change nFileSize to the number of bytes that need to be translated (this would typically be the 'viewing' range of text in the RichEdit).
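
    The conversion step above can be sketched as a small procedure. This is only a sketch under assumptions: it expects the MASM32 include files in their usual \masm32 location, and WideBufToAnsi is a hypothetical helper name that does not come from this thread. Unlike the macro, it passes an explicit character count instead of -1, so the source buffer does not have to be null-terminated (a file loaded through file mapping usually is not).

```asm
.386
.model flat, stdcall
option casemap:none

include \masm32\include\windows.inc     ; assumed MASM32 install path
include \masm32\include\kernel32.inc
includelib \masm32\lib\kernel32.lib

.code

; WideBufToAnsi (hypothetical helper, not from the thread)
;   pwszSrc - pointer to UTF-16 little endian text
;   cbSrc   - size of that text in BYTES (e.g. nFileSize)
;   pszDst  - destination buffer for the ANSI text
;   cbDst   - size of the destination buffer in bytes
; Returns in EAX the number of bytes written, or 0 on failure / empty input.
WideBufToAnsi proc pwszSrc:DWORD, cbSrc:DWORD, pszDst:DWORD, cbDst:DWORD
    mov ecx, cbSrc
    shr ecx, 1                  ; bytes -> number of 16-bit characters
    .if ecx == 0
        xor eax, eax            ; nothing to convert
        ret
    .endif
    ; explicit source length, so no terminating null is required
    invoke WideCharToMultiByte, CP_ACP, 0, pwszSrc, ecx, pszDst, cbDst, NULL, NULL
    ret
WideBufToAnsi endp

end
```

    With the names from the steps above, a call could look like "invoke WideBufToAnsi, pSrcData, nFileSize, pGlobalData, nFileSize". A destination of nFileSize bytes is an assumption that is normally generous enough, since each 16-bit character becomes one or two ANSI bytes.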


    I hope this is some help to you. I didn't want to get into code directly, as I pretty well hate using RichEdit and try to avoid it myself. Let me know if you still need help with something, and if you want me to 'code' a bit, post or email me some source to start with (save me the headaches of RichEdit ;P )

    Best of luck.
    :alright:
    NaN
Posted on 2002-03-30 12:13:19 by NaN
After re-reading your reply I'm starting to realize you *do* need help with detecting when it's time...

I don't have an NT system, so I'm not familiar with the Notepad you're talking about, but my only suggestion is to use the open file name dialog box and provide a custom dialog template and a callback routine, so the user gets a checkbox for 'open as Unicode' or something. At least then you would know the file is intended as Unicode. But this is a bit of work on its own...

This is just a suggestion/idea. I don't know how well it will work (if it is possible at all). Most of the time a file being opened has a header (in the file), and you can decide what to do with it from that, or from its extension. But for .txt files there is neither, so I can see the dilemma here...

Perhaps someone else has some useful suggestions here?

:NaN:
Posted on 2002-03-30 12:20:24 by NaN
Well, once you select the file you want, you don't have to process the data into the RichEdit right away. All you have to do is pop up a window asking whether the user would like the file processed as ANSI or Unicode. Once the user has made his/her selection, you run an ANSI or a Unicode procedure depending on the choice.
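
A minimal sketch of that idea, assuming the MASM32 include files in their usual \masm32 location. AskStreamFormat and the two strings are hypothetical names made up for illustration. The procedure asks a yes/no question with MessageBox and returns the stream format flags that could later be used as the wParam of EM_STREAMIN.

```asm
.386
.model flat, stdcall
option casemap:none

include \masm32\include\windows.inc     ; assumed MASM32 install path
include \masm32\include\user32.inc
includelib \masm32\lib\user32.lib

IFNDEF SF_UNICODE
SF_UNICODE equ 0010h                    ; value from richedit.h, in case windows.inc lacks it
ENDIF

.data
szAskTitle db "Open file", 0
szAskText  db "Treat this file as Unicode (UTF-16) text?", 0

.code

; AskStreamFormat (hypothetical helper, not from the thread)
; Returns in EAX: SF_TEXT or SF_UNICODE if the user answers Yes,
;                 SF_TEXT alone otherwise.
AskStreamFormat proc hWndOwner:DWORD
    invoke MessageBox, hWndOwner, addr szAskText, addr szAskTitle, MB_YESNO or MB_ICONQUESTION
    .if eax == IDYES
        mov eax, SF_TEXT or SF_UNICODE
    .else
        mov eax, SF_TEXT
    .endif
    ret
AskStreamFormat endp

end
```

A message box is the cheapest way to ask; a custom template on the open file dialog, as suggested earlier in the thread, would be nicer for the user but takes more work.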
Posted on 2002-03-30 12:58:22 by smurf
Hi! NaN:
Following your suggestion, I changed my code for reading a file into the RichEdit control. I use the standard technique of sending the EM_STREAMIN message to the RichEdit, so I must code an EditStreamCallback:

EditStreamCallback proc IsOpen:DWORD, pBuffer:DWORD, NumBytes:DWORD, pBytesTransferred:DWORD
LOCAL hGlobal, pSrcData:DWORD
.IF IsOpen == TRUE
invoke GlobalAlloc, GMEM_ZEROINIT or GMEM_FIXED, NumBytes
mov hGlobal, eax
invoke GlobalLock, hGlobal
mov pSrcData, eax
invoke ReadFile, hFile, pSrcData, NumBytes, pBytesTransferred, 0
Unicode2Ascii pBuffer, pSrcData, NumBytes
invoke GlobalUnlock, hGlobal
invoke GlobalFree, hGlobal
.ELSE
invoke WriteFile, hFile, pBuffer, NumBytes, pBytesTransferred, 0
.ENDIF
xor eax,1
ret
EditStreamCallback endp


I only changed the open-file path, but it doesn't work; before the change it ran happily. NaN, can you figure out what's wrong in my code snippet? Thanks for your help!
Posted on 2002-03-30 14:00:41 by dREAMtHEATER
I don't have a RichEdit example on hand to test this with, but try this out:
EditStreamCallback proc IsOpen:DWORD, pBuffer:DWORD, NumBytes:DWORD, pBytesTransferred:DWORD

    LOCAL hGlobal:DWORD, pSrcData:DWORD
    LOCAL ErrorFlag:DWORD

    mov ErrorFlag, FALSE

    .IF IsOpen == TRUE
        invoke GlobalAlloc, GMEM_ZEROINIT or GMEM_FIXED, NumBytes
        mov hGlobal, eax
        invoke GlobalLock, hGlobal
        mov pSrcData, eax
        invoke ReadFile, hFile, pSrcData, NumBytes, pBytesTransferred, 0
        .if eax == 0
            mov ErrorFlag, TRUE
        .else
            mov edx, pBytesTransferred
            mov ecx, [edx]          ; Actual number of bytes loaded
            shr ecx, 1              ; bytes -> 16-bit characters
            ; Explicit length instead of the macro's -1: the data just read
            ; is not guaranteed to be null-terminated.
            invoke WideCharToMultiByte, CP_ACP, 0, pSrcData, ecx, pBuffer, NumBytes, NULL, NULL
            mov edx, pBytesTransferred
            mov [edx], eax          ; report the bytes actually placed in pBuffer
        .endif
        ; release the temporary buffer on the error path as well
        invoke GlobalUnlock, hGlobal
        invoke GlobalFree, hGlobal  ; If the function succeeds, the return value is NULL.
    .ELSE
        invoke WriteFile, hFile, pBuffer, NumBytes, pBytesTransferred, 0
        ; If the function succeeds, the return value is TRUE.
        .if eax == 0
            mov ErrorFlag, TRUE
        .endif
    .ENDIF
    ; The callback function returns zero to indicate success
    ; If no errors, EAX == FALSE (0)
    ; If error    , EAX == TRUE  (1)
    mov eax, ErrorFlag
    ret
EditStreamCallback endp


There were two things I noted in your callback:

1) Your return value indicating the callback's success was flawed. XOR EAX,1 would return 1 if the read-file operations finished properly. This is because the last API (GlobalFree) sets eax == 0 on success, which then gets toggled to 1 before exiting (falsely indicating an error). However, the else branch would work properly, as a successful write returns TRUE, which then gets toggled to zero before exiting.

I scrapped this system and suggest a simple flag system to return properly (as shown).

2) The number of bytes being 'suggested' to the Unicode->Ascii conversion may not be the correct amount. There is no guarantee that ReadFile will read all of what was asked for, so you should use the number of bytes you actually have.

I made a simple fix here.


One last area of caution, from reading up on how EM_STREAMIN sounds like it works: the callback will be called continuously until an error occurs (the callback reporting some failure).

There is no "real" handling for this in the above example, as even at EOF the *read* will repeatedly return success. It's possible your callback will never stop getting called, but I doubt this will actually happen. This is because one of the conditions for stopping the callback loop is that the returned number of bytes read == 0. This *should* happen on the second read attempt at EOF (where EOF was reached in the middle of the previous read). Since this value is properly set in the callback, the streaming should stop when finished based on it.

Anywho, there is a surface analysis for you...
Hope it helps..
:alright:
NaN
Posted on 2002-03-30 23:50:38 by NaN
Hi NaN:
Today I looked through the explanation of the EM_STREAMIN message in MSDN and learned something new (maybe I am stupid): the RichEdit control can deal with Unicode itself, so a lot of the code we discussed can be omitted. Before I send EM_STREAMIN, I must check whether the file I want to open contains Unicode text. If it does, its first two bytes are *FFFE* (UCS-2 little endian) or *FEFF* (UCS-2 big endian), and I then include SF_UNICODE in the wParam of EM_STREAMIN. The other housekeeping chores are done by the RichEdit control.

Below is the code snippet:

OpenFiles proc

    LOCAL editStream:EDITSTREAM
    LOCAL buffer:DWORD
    LOCAL BytesRead:DWORD
    LOCAL UniTest:DWORD

    invoke CreateFile, OFFSET PathNameBuff, GENERIC_READ, FILE_SHARE_READ, \
           NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL
    .IF eax == INVALID_HANDLE_VALUE
        dsText szCreateFile, "CreateFile"
        invoke HandleError, ADDR szCreateFile
        mov eax, FALSE
    .ELSE
        mov hFile, eax
        ;
        ; read the first two bytes and test them for a Unicode signature
        ;
        mov buffer, 0
        invoke ReadFile, hFile, addr buffer, 2, addr BytesRead, NULL

        mov UniTest, IS_TEXT_UNICODE_SIGNATURE or IS_TEXT_UNICODE_REVERSE_MASK
        invoke IsTextUnicode, addr buffer, 4, addr UniTest
        .if eax == 0
            ; no signature: rewind so the first two bytes are streamed in as text
            invoke SetFilePointer, hFile, NULL, NULL, FILE_BEGIN
            mov ecx, SF_TEXT
        .else
            ; signature found: the file pointer stays just past it
            mov ecx, SF_TEXT or SF_UNICODE
        .endif
        ;
        ; stream the text into the richedit control
        ;
        mov editStream.dwCookie, Open
        mov editStream.pfnCallback, OFFSET EditStreamCallback
        invoke SendMessage, hEdit, EM_STREAMIN, ecx, ADDR editStream

        invoke CloseHandle, hFile
        invoke SendMessage, hEdit, EM_SETMODIFY, FALSE, 0
    .ENDIF
    ret
OpenFiles endp

[code] EditStreamCallback proc IsOpen:DWORD, pBuffer:DWORD, NumBytes:DWORD, pBytesTransferred:DWORD
LOCAL ErrorFlag :DWORD
mov ErrorFlag, FALSE
.IF IsOpen == TRUE
invoke ReadFile, hFile, pBuffer, NumBytes, pBytesTransferred, 0
.if eax == 0
mov ErrorFlag, TRUE
.endif
.ELSE
invoke WriteFile, hFile, pBuffer, NumBytes, pBytesTransferred, 0
.if eax == 0
mov ErrorFlag, TRUE
.endif
.ENDIF
; The callback function returns zero to indicate success
; If no errors, EAX == FALSE (0)
; If error    , EAX == TRUE  (1)
mov eax, ErrorFlag
ret
EditStreamCallback endp[/code]
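
The signature check described in this post can also be done by hand on the first bytes of the file. The sketch below is only an illustration: DetectBom and the BOM_* constants are hypothetical names, not part of the posted program or of any Windows header. It also recognises the UTF-8 signature (bytes EF BB BF), since UTF-8 came up earlier in the thread.

```asm
.386
.model flat, stdcall
option casemap:none

; hypothetical result codes, defined here only for this sketch
BOM_NONE    equ 0
BOM_UTF16LE equ 1       ; file starts with bytes FF FE
BOM_UTF16BE equ 2       ; file starts with bytes FE FF
BOM_UTF8    equ 3       ; file starts with bytes EF BB BF

.code

; DetectBom (hypothetical helper)
;   pData  - pointer to the first bytes of the file
;   cbData - how many bytes are valid at pData
; Returns one of the BOM_* codes in EAX.
DetectBom proc pData:DWORD, cbData:DWORD
    mov edx, pData
    mov ecx, cbData
    mov eax, BOM_NONE
    .if ecx >= 2
        ; a WORD load is little endian, so bytes FF FE read as 0FEFFh
        .if word ptr [edx] == 0FEFFh
            mov eax, BOM_UTF16LE
        .elseif word ptr [edx] == 0FFFEh
            mov eax, BOM_UTF16BE
        .elseif ecx >= 3
            ; bytes EF BB read as 0BBEFh, followed by BF
            .if word ptr [edx] == 0BBEFh && byte ptr [edx+2] == 0BFh
                mov eax, BOM_UTF8
            .endif
        .endif
    .endif
    ret
DetectBom endp

end
```

A caller could read the first three bytes of the file, pass them to DetectBom, and pick SF_TEXT or SF_TEXT or SF_UNICODE from the result. A file without any signature returns BOM_NONE and still needs a guess or a question to the user.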

But I found a limitation of the RichEdit control: it can only deal with little-endian Unicode. If the text file is big-endian Unicode, it does the wrong thing, so you must handle that yourself. I don't know how to do it.

You are a warmhearted man, and I greatly appreciate your timely help.

dREAMtHEATER
Posted on 2002-03-31 11:19:19 by dREAMtHEATER
Hey, you figured it out, I just gave you a push :)

Glad it's now working...
:alright:
NaN
Posted on 2002-03-31 12:37:20 by NaN
Hi NaN:
I have now resolved the bug where Unicode text is detected as big endian but the RichEdit control cannot deal with it. I added code to EditStreamCallback for the read path: if the Unicode text is in big-endian format, I swap the two bytes of every double-byte character to get little-endian format, and then it runs happily.
In the attachment I am posting the text editor that I coded myself with the RichEdit control. I hope you can take some time to test it, thanks.

    .IF IsOpen == TRUE
        invoke ReadFile, hFile, pBuffer, NumBytes, pBytesTransferred, 0
        .if eax == 0
            mov ErrorFlag, TRUE
        .endif
        ; UniTest holds the result flags that IsTextUnicode stored in OpenFiles;
        ; it has to be visible here (for example as a global variable)
        mov eax, UniTest
        test eax, IS_TEXT_UNICODE_REVERSE_SIGNATURE
        .IF !ZERO?
            ; big endian: swap the two bytes of every 16-bit character
            mov edx, pBytesTransferred
            mov eax, [edx]              ; bytes actually read, may be less than NumBytes
            shr eax, 1                  ; bytes -> 16-bit characters
            mov ecx, pBuffer
            .WHILE eax != 0             ; also safe when nothing was read (EOF)
                rol word ptr [ecx], 8   ; exchange low and high byte
                add ecx, 2
                dec eax
            .ENDW
        .endif
    .ELSE
        invoke WriteFile, hFile, pBuffer, NumBytes, pBytesTransferred, 0
        .if eax == 0
            mov ErrorFlag, TRUE
        .endif
    .ENDIF
    ; The callback function returns zero to indicate success
    ; If no errors, EAX == FALSE (0)
    ; If error    , EAX == TRUE  (1)
    mov eax, ErrorFlag
    ret
EditStreamCallback endp


dREAMtHEATER
Posted on 2002-04-01 00:08:16 by dREAMtHEATER
I'm a bit lost... Did you get it all figured out? Because when I try to load an asm file I get a stream of oriental symbols on one line?? :confused:

:NaN:
Posted on 2002-04-01 03:37:07 by NaN
Hi NaN
You must turn word wrap on by checking menu --> Edit --> Word Wrap.
Thanks for testing, I hope you can give it a deeper test.

dREAMtHEATER


Posted on 2002-04-01 09:38:39 by dREAMtHEATER
I think you're missing the point here... Times New Roman looks like this:
Posted on 2002-04-01 13:14:06 by NaN
Maybe the program only works with Unicode files (16-bit characters). Your normal ASM file is ANSI/ASCII, not Unicode.
Posted on 2002-04-01 16:50:13 by tenkey
Hi NaN:
Can you send me or attach the file that generates the error in my Minipad, so I can figure it out? And can you tell me the language version of your Windows? Thanks!

dREAMtHEATER


Posted on 2002-04-02 00:36:11 by dREAMtHEATER
The file is in the title line of the picture and is part of the masm package produced by hutch (i assume you already have it).

My language is U.S. English, on Win98 SE.

Hope it helps.
:alright:
NaN
Posted on 2002-04-02 00:57:38 by NaN
Hi NaN:
The MASM I have is v7.0. Could you tell me exactly which file in the toolkit it is? Thanks!

dREAMtHEATER


Posted on 2002-04-02 01:03:05 by dREAMtHEATER