RichEditBox change proposal, October 2015

HMG Unicode versions 3.1.x related

Post by kcarmody »

As I promised a few days ago (http://hmgforum.com/viewtopic.php?f=43&t=4493&start=4), I am now submitting a proposal to add features to the Rich Edit Box control.

I submitted similar proposals in November 2014 for HMG 3.3.1 (http://hmgforum.com/viewtopic.php?f=43&t=4071&start=11), and in September 2015 for HMG 3.4.1 (http://hmgforum.com/viewtopic.php?f=43&t=4471&start=38), but each time only a small portion of the changes I proposed were put into the next version. This time I am explaining more, so that it will be easier to understand how the changes I am proposing fit together.

These changes are a patch that should be installed on top of version 3.4.2. Individual modified source files are at http://kevincarmody.com/hmg/, and a zip of files in the patch is at http://kevincarmody.com/hmg/HmgChangeProposal.zip.

This patch includes an overhauled Rich Edit demo, which uses all the source code changes, except for the HasNonAnsiChars property and the SelPasteSpecial method. The new Rich Edit demo is at http://kevincarmody.com/hmg/SAMPLES/Con ... chEditBox/, including the executable at http://kevincarmody.com/hmg/SAMPLES/Con ... x/demo.exe.
New rich edit control property HASNONASCIICHARS (read only)

Detects whether a rich edit control contains non-ASCII Unicode characters.

http://kevincarmody.com/hmg/INCLUDE/i_window.ch - line 175

Code: Select all

;; /* 
      Following line modified by Kevin Carmody, October 2015

      It adds the HasNonAsciiChars and HasNonAnsiChars properties to the rich edit box control.
      HasNonAsciiChars detects whether a rich edit control contains non-ASCII Unicode characters.
      HasNonAnsiChars detects whether a rich edit control contains non-ANSI Unicode characters.

      See 
        _RichEditBox_GetProperty() in SOURCE\h_controlmisc.prg
        RichEditBox_HasNonAsciiChars() and RichEditBox_HasNonAnsiChars() in SOURCE\h_richeditbox.prg
        HMG_IsNonASCII() and HMG_UTF8IsNonANSI() in SOURCE\h_UNICODE_String.prg
   */ ;;
#xtranslate <w>. \<c\> . \<p:RTFTextMode,AutoURLDetect,Zoom,SelectRange,CaretPos,Value,GetSelectText,GetTextLength,ViewRect,HasNonAsciiChars,HasNonAnsiChars\> => GetProperty ( <"w">, \<"c"\> , \<"p"\> ) ;;
http://kevincarmody.com/hmg/SOURCE/h_controlmisc.prg - lines 10296-10298

Code: Select all

   /*
      Following two cases added by Kevin Carmody, October 2015

      They add the HasNonAsciiChars and HasNonAnsiChars properties to the rich edit box control.
      HasNonAsciiChars detects whether a rich edit control contains non-ASCII Unicode characters.
      HasNonAnsiChars detects whether a rich edit control contains non-ANSI Unicode characters.

      See 
        HasNonAsciiChars and HasNonAnsiChars translations in SOURCE\i_window.ch
        RichEditBox_HasNonAsciiChars() and RichEditBox_HasNonAnsiChars() in SOURCE\h_richeditbox.prg
        HMG_IsNonASCII() and HMG_UTF8IsNonANSI() in SOURCE\h_UNICODE_String.prg
   */ 
   CASE Arg3 == "HASNONASCIICHARS"
        xData  := RichEditBox_HasNonAsciiChars ( hWndControl )
        RetVal := .T.
   
   CASE Arg3 == "HASNONANSICHARS"
        xData  := RichEditBox_HasNonAnsiChars ( hWndControl )
        RetVal := .T.
HasNonAsciiChars calls RICHEDITBOX_HASNONASCIICHARS(), a new function.

http://kevincarmody.com/hmg/SOURCE/h_richeditbox.prg - lines 532-538

Code: Select all

/*
    Following function added by Kevin Carmody, October 2015

    This function tests for the presence of non-ASCII characters in a rich 
    edit control.  For efficiency, it does not distinguish between non-ASCII 
    ANSI and non-ASCII Unicode.

    See 
      HasNonAsciiChars translation in SOURCE\i_window.ch
      _RichEditBox_GetProperty() in SOURCE\h_controlmisc.prg
      HMG_IsNonASCII() in SOURCE\h_UNICODE_String.prg
*/

*-----------------------------------------------------------------------------*
FUNCTION RichEditBox_HasNonAsciiChars( hWndControl )
*-----------------------------------------------------------------------------*

LOCAL cBuffer := RichEditBox_GetText( hWndControl, .N. )

RETURN HMG_IsNonASCII( cBuffer, .N. )
RichEditBox_HasNonAsciiChars() calls HMG_ISNONASCII(), a new function.

Determines whether a string contains any non-ASCII characters.

http://kevincarmody.com/hmg/SOURCE/h_UNICODE_String.prg - lines 329-343

Code: Select all

/*
    Following function added by Kevin Carmody, October 2015

    Determines whether a string contains any non-ASCII characters.

    See 
      HasNonAsciiChars translation in SOURCE\i_window.ch
      _RichEditBox_GetProperty() in SOURCE\h_controlmisc.prg
      RichEditBox_HasNonAsciiChars() in SOURCE\h_richeditbox.prg
*/

FUNCTION HMG_IsNonASCII( cString )

LOCAL lNonASCII := .F.
LOCAL cChar

   BEGIN SEQUENCE
      FOR EACH cChar IN cString
         IF cChar >= CHR( 0x80 )
            lNonASCII := .T.
            BREAK
         ENDIF
      NEXT
   END SEQUENCE

RETURN lNonASCII
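
For comparison, the same test can be sketched in C. This is only an illustration of the technique (stop at the first byte with the high bit set); it is not part of the patch, and the name has_non_ascii is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not part of the patch.
   Returns true if any byte in buf[0..len) lies outside the
   7-bit ASCII range 0x00-0x7F. */
bool has_non_ascii(const unsigned char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (buf[i] >= 0x80)
            return true;   /* stop at the first non-ASCII byte */
    }
    return false;
}
```

Because every byte of a multi-byte UTF-8 sequence is 0x80 or higher, this byte-level scan gives the same answer for ANSI and UTF-8 input, which is why it cannot tell non-ASCII ANSI from non-ASCII Unicode.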
New rich edit control property HASNONANSICHARS (read only)

Detects whether a rich edit control contains non-ANSI Unicode characters.

http://kevincarmody.com/hmg/INCLUDE/i_window.ch - line 175 (the same translation as listed under HASNONASCIICHARS above)

http://kevincarmody.com/hmg/SOURCE/h_controlmisc.prg - lines 10300-10302 (the HASNONANSICHARS case in the listing under HASNONASCIICHARS above)

HasNonAnsiChars calls RICHEDITBOX_HASNONANSICHARS(), a new function.

http://kevincarmody.com/hmg/SOURCE/h_richeditbox.prg - lines 555-561

Code: Select all

/*
    Following function added by Kevin Carmody, October 2015

    This function tests for the presence of non-ANSI characters in a rich 
    edit control.  It is slower than RichEditBox_HasNonAsciiChars but does 
    not reject any Unicode characters that are in ANSI.

    See 
      HasNonAnsiChars translation in SOURCE\i_window.ch
      _RichEditBox_GetProperty() in SOURCE\h_controlmisc.prg
      HMG_UTF8IsNonANSI() in SOURCE\h_UNICODE_String.prg
*/

*-----------------------------------------------------------------------------*
FUNCTION RichEditBox_HasNonAnsiChars( hWndControl )
*-----------------------------------------------------------------------------*

LOCAL cBuffer := RichEditBox_GetText( hWndControl, .N. )

RETURN HMG_UTF8IsNonANSI( cBuffer )
RichEditBox_HasNonAnsiChars() calls HMG_UTF8ISNONANSI(), a new function.

Determines whether a UTF-8 string contains any non-ANSI characters.

http://kevincarmody.com/hmg/SOURCE/h_UNICODE_String.prg - lines 358-438

Code: Select all

/*
    Following function added by Kevin Carmody, October 2015

    Determines whether a UTF-8 string contains any non-ANSI characters.
    It does not check whether the string is valid UTF-8.

    See 
      HasNonAnsiChars translation in SOURCE\i_window.ch
      _RichEditBox_GetProperty() in SOURCE\h_controlmisc.prg
      RichEditBox_HasNonAnsiChars() in SOURCE\h_richeditbox.prg
*/

FUNCTION HMG_UTF8IsNonANSI( cUtf8Str )

LOCAL aAnsiTrans := { ;
   0x20AC, ; // ANSI 0x80 - EURO SIGN
   0x201A, ; // ANSI 0x82 - SINGLE LOW-9 QUOTATION MARK
   0x0192, ; // ANSI 0x83 - LATIN SMALL LETTER F WITH HOOK
   0x201E, ; // ANSI 0x84 - DOUBLE LOW-9 QUOTATION MARK
   0x2026, ; // ANSI 0x85 - HORIZONTAL ELLIPSIS
   0x2020, ; // ANSI 0x86 - DAGGER
   0x2021, ; // ANSI 0x87 - DOUBLE DAGGER
   0x02C6, ; // ANSI 0x88 - MODIFIER LETTER CIRCUMFLEX ACCENT
   0x2030, ; // ANSI 0x89 - PER MILLE SIGN
   0x0160, ; // ANSI 0x8A - LATIN CAPITAL LETTER S WITH CARON
   0x2039, ; // ANSI 0x8B - SINGLE LEFT-POINTING ANGLE QUOTATION MARK
   0x0152, ; // ANSI 0x8C - LATIN CAPITAL LIGATURE OE
   0x017D, ; // ANSI 0x8E - LATIN CAPITAL LETTER Z WITH CARON
   0x2018, ; // ANSI 0x91 - LEFT SINGLE QUOTATION MARK
   0x2019, ; // ANSI 0x92 - RIGHT SINGLE QUOTATION MARK
   0x201C, ; // ANSI 0x93 - LEFT DOUBLE QUOTATION MARK
   0x201D, ; // ANSI 0x94 - RIGHT DOUBLE QUOTATION MARK
   0x2022, ; // ANSI 0x95 - BULLET
   0x2013, ; // ANSI 0x96 - EN DASH
   0x2014, ; // ANSI 0x97 - EM DASH
   0x02DC, ; // ANSI 0x98 - SMALL TILDE
   0x2122, ; // ANSI 0x99 - TRADE MARK SIGN
   0x0161, ; // ANSI 0x9A - LATIN SMALL LETTER S WITH CARON
   0x203A, ; // ANSI 0x9B - SINGLE RIGHT-POINTING ANGLE QUOTATION MARK
   0x0153, ; // ANSI 0x9C - LATIN SMALL LIGATURE OE
   0x017E, ; // ANSI 0x9E - LATIN SMALL LETTER Z WITH CARON
   0x0178  } // ANSI 0x9F - LATIN CAPITAL LETTER Y WITH DIAERESIS
LOCAL aAnsiSkip  := { ;
   0x81, ;
   0x8D, ;
   0x8F, ;
   0x90, ;
   0x9D  }

LOCAL lNonANSI := .F.
LOCAL nOctets  := 0
LOCAL cChar, nChar, nCode

   BEGIN SEQUENCE

      FOR EACH cChar IN cUtf8Str

         nChar := HB_BCODE( cChar )

         IF nOctets != 0

            --nOctets
            nCode := HB_BITOR( HB_BITSHIFT( nCode, 6 ), HB_BITAND( nChar, 0x3F ) )
            IF nOctets == 0
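               DO CASE
               CASE nCode >= 0x100
                  // Above U+00FF: ANSI only if it maps to one of the bytes 0x80-0x9F
                  IF ASCAN( aAnsiTrans, nCode ) == 0
                     lNonANSI := .T.
                     BREAK
                  ENDIF
               CASE nCode >= 0xA0
                  // U+00A0-U+00FF: same values in ANSI, nothing to do
               CASE nCode >= 0x80
                  // U+0080-U+009F: ANSI only for the five unassigned bytes
                  IF ASCAN( aAnsiSkip, nCode ) == 0
                     lNonANSI := .T.
                     BREAK
                  ENDIF
               ENDCASE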
            ENDIF

         ELSEIF HB_BITAND( nChar, 0x80 ) != 0

            DO WHILE HB_BITAND( nChar, 0x80 ) != 0
               nChar := HB_BITAND( HB_BITSHIFT ( nChar, 1 ), 0xFF )
               ++nOctets
            ENDDO
            --nOctets
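            // A lead byte with nOctets continuation bytes carries 6 - nOctets payload bits
            // (5 for a 2-byte sequence, 4 for 3 bytes, 3 for 4 bytes).
            nCode := HB_BITAND( HB_BCODE( cChar ), HB_BITSHIFT( 1, 6 - nOctets ) - 1 )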

         ENDIF

      NEXT

   END SEQUENCE

RETURN lNonANSI
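
The decoding and lookup steps may be easier to follow in a compact C sketch. This is an illustration under stated assumptions, not the patch code: it takes "ANSI" to mean Windows-1252 (as the table comments above do), assumes well-formed UTF-8 just as the Harbour function does, and uses the hypothetical names in_cp1252 and utf8_has_non_cp1252.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Code points above U+00FF that Windows-1252 assigns to bytes 0x80-0x9F
   (the same values as aAnsiTrans in the post). */
static const uint32_t cp1252_extra[] = {
    0x20AC, 0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021, 0x02C6, 0x2030,
    0x0160, 0x2039, 0x0152, 0x017D, 0x2018, 0x2019, 0x201C, 0x201D, 0x2022,
    0x2013, 0x2014, 0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0x017E, 0x0178
};

/* Hypothetical helper: true if code point cp is representable in Windows-1252. */
static bool in_cp1252(uint32_t cp)
{
    if (cp < 0x80 || (cp >= 0xA0 && cp <= 0xFF))
        return true;
    /* The five bytes Windows-1252 leaves unassigned (aAnsiSkip in the post). */
    if (cp == 0x81 || cp == 0x8D || cp == 0x8F || cp == 0x90 || cp == 0x9D)
        return true;
    for (size_t i = 0; i < sizeof cp1252_extra / sizeof cp1252_extra[0]; i++)
        if (cp1252_extra[i] == cp)
            return true;
    return false;
}

/* Hypothetical helper: true if the UTF-8 text holds any code point
   outside Windows-1252. Input is assumed to be well-formed UTF-8. */
bool utf8_has_non_cp1252(const unsigned char *s, size_t len)
{
    size_t i = 0;
    while (i < len) {
        unsigned char b = s[i++];
        uint32_t cp;
        int extra;                          /* continuation bytes still to read */
        if (b < 0x80)      { cp = b;        extra = 0; }
        else if (b < 0xE0) { cp = b & 0x1F; extra = 1; }   /* 110xxxxx */
        else if (b < 0xF0) { cp = b & 0x0F; extra = 2; }   /* 1110xxxx */
        else               { cp = b & 0x07; extra = 3; }   /* 11110xxx */
        while (extra-- > 0 && i < len)
            cp = (cp << 6) | (s[i++] & 0x3F);
        if (!in_cp1252(cp))
            return true;                    /* first non-ANSI code point found */
    }
    return false;
}
```

The lead-byte masks 0x1F, 0x0F and 0x07 are the part most easily gotten wrong: the number of payload bits in the lead byte shrinks as the sequence gets longer.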
New rich edit control method LOADFILE()

Synonym for RtfLoadFile, which has been enhanced. See RtfLoadFile below.

http://kevincarmody.com/hmg/INCLUDE/i_window.ch - lines 195-196, 203

Code: Select all

;; /*
      Following 2 lines modified by Kevin Carmody, October 2015

      They add the LoadFile and SaveFile methods to the rich edit box control
      as synonyms for the RTFLoadFile and RTFSaveFile methods.

      See 
        _RichEditBox_DoMethod() in SOURCE\h_controlmisc.prg
        RichEditBox_LoadFile() and RichEditBox_SaveFile() in SOURCE\h_richeditbox.prg
        RichEditBox_StreamIn() and RichEditBox_StreamOut() in SOURCE\c_richeditbox.c
        HMG_UTF16ByteSwap() in SOURCE\h_UNICODE_String.prg
   */ ;;
#xtranslate <w>. \<c\> . \<p:RTFLoadFile,RTFSaveFile,LoadFile,SaveFile\> (\<arg1\>,\<arg2\>,\<arg3\>) => DoMethod ( <"w">, \<"c"\> , \<"p"\> , \<arg1\> , \<arg2\>, \<arg3\> ) ;;
#xtranslate <w>. \<c\> . \<p:RTFLoadFile,RTFSaveFile,LoadFile,SaveFile\> (\<arg1\>,\<arg2\>) => DoMethod ( <"w">, \<"c"\> , \<"p"\> , \<arg1\> , \<arg2\> ) ;;
;; /*
      Following line added by Kevin Carmody, October 2015

      It allows the RTFLoadFile, RTFSaveFile, LoadFile, and SaveFile methods to be called with one argument,
      the file name.  The second argument, lSelection (RichEditBox_LoadFile() in h_richeditbox.prg), defaults to .F.
   */ ;;
#xtranslate <w>. \<c\> . \<p:RTFLoadFile,RTFSaveFile,LoadFile,SaveFile\> (\<arg1\>) => DoMethod ( <"w">, \<"c"\> , \<"p"\> , \<arg1\> ) ;;
http://kevincarmody.com/hmg/SOURCE/h_controlmisc.prg - lines 10492-10494

Code: Select all

   /*
      Following two cases modified by Kevin Carmody, October 2015

      These cases use the xData argument defined above to return a value from 
      the Rich Edit methods RTFLoadFile, LoadFile, RTFSaveFile, and SaveFile.
   */ 
   CASE Arg3 == HMG_UPPER ("RTFLoadFile") .OR. Arg3 == HMG_UPPER ("LoadFile")
        xData := RichEditBox_LoadFile ( hWndControl, Arg4, Arg5, Arg6 )   // by default load in SF_RTF format
        RetVal := .T.

   CASE Arg3 == HMG_UPPER ("RTFSaveFile") .OR. Arg3 == HMG_UPPER ("SaveFile")
        xData := RichEditBox_SaveFile ( hWndControl, Arg4, Arg5, Arg6 )   // by default save in SF_RTF format
        RetVal := .T.
New rich edit control method SAVEFILE()

Synonym for RtfSaveFile, which has been enhanced. See RtfSaveFile below.

http://kevincarmody.com/hmg/INCLUDE/i_window.ch - lines 195-196, 203 (the same translations as listed under LOADFILE() above)

http://kevincarmody.com/hmg/SOURCE/h_controlmisc.prg - lines 10496-10498 (the RTFSaveFile / SaveFile case in the listing under LOADFILE() above)

New rich edit control method SELPASTESPECIAL()
  • Pastes the clipboard into a rich edit box control using a specified format.
  • Clipboard formats CF_* are declared in <winuser.h> and documented in MSDN.
http://kevincarmody.com/hmg/INCLUDE/i_window.ch - line 216 - this translation is new

Code: Select all

;; /*
      Following line added by Kevin Carmody, October 2015

      It adds the SelPasteSpecial method to the rich edit box control.
      This method pastes the clipboard into a rich edit box control using a specified format.

      See
        _RichEditBox_DoMethod() in SOURCE\h_controlmisc.prg
        RichEditBox_PasteSpecial() in SOURCE\c_richeditbox.c
   */ ;;
#xtranslate <w>. \<c\> . \<p:SelPasteSpecial\> (\<arg1\>) => DoMethod ( <"w">, \<"c"\> , \<"p"\> , \<arg1\> ) ;;
http://kevincarmody.com/hmg/SOURCE/h_controlmisc.prg - lines 10500-10502 - this definition is already in HMG 3.4.2.

Code: Select all

   CASE Arg3 == HMG_UPPER ("SelPasteSpecial")
        RichEditBox_PasteSpecial ( hWndControl, Arg4 )
        RetVal := .T.
Enhanced rich edit control method RTFLOADFILE()
  • Skips over byte order marks in Unicode text files.
  • Supports UTF-16 BE text files (big endian Unicode text file).
  • The RTFUTF8 file type has been removed and the UTF-16 BE file type has been added. See below.
http://kevincarmody.com/hmg/INCLUDE/i_window.ch - lines 195-196, 203 (the same translations as listed under LOADFILE() above)

http://kevincarmody.com/hmg/SOURCE/h_controlmisc.prg - lines 10492-10494 (the RTFLoadFile / LoadFile case in the listing under LOADFILE() above)

File types

The file types that RtfLoadFile and RtfSaveFile accept have changed. The type parameter now accepts one of the following constants:
  • RICHEDITFILE_RTF - RTF file
  • RICHEDITFILE_TEXTANSI - ANSI text file
  • RICHEDITFILE_TEXTUTF8 - UTF-8 text file
  • RICHEDITFILE_TEXTUTF16LE - UTF-16 LE (little endian) text file
  • RICHEDITFILE_TEXTUTF16BE - UTF-16 BE (big endian) text file
Changes from the previous set of constants:
  • RICHEDITFILE_TEXT has been renamed to RICHEDITFILE_TEXTANSI for clarity.
  • RICHEDITFILE_TEXTUTF16BE has been added because RtfLoadFile and RtfSaveFile now support UTF-16 BE text files. UTF-16 BE files are supported by Notepad and many other word processing applications.
  • RICHEDITFILE_TEXTUTF16 has been renamed to RICHEDITFILE_TEXTUTF16LE to clearly distinguish it from RICHEDITFILE_TEXTUTF16BE.
  • RICHEDITFILE_RTFUTF8 is unnecessary and has been removed. Although the Windows EM_STREAMIN and EM_STREAMOUT messages support it, in practice this file type never occurs, and it is not supported by any standard word processing application that I have ever seen. In RTF files, non-ASCII characters are always written as ASCII escape sequences, e.g. \'e8 for è (CHR(0xE8)) and \u916? for Δ (Greek uppercase delta, U+0394, decimal 916). So UTF-8 on top of RTF is never needed.
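
To make the escape sequences mentioned above concrete, here is a rough C sketch of how a single code point could be written as RTF text. It is an illustration only, with assumptions stated plainly: the function name rtf_escape is hypothetical, the document code page is assumed to be Windows-1252 (so code points U+00A0 to U+00FF equal their ANSI byte values), and code points outside the Basic Multilingual Plane are simply rejected.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper, not part of the patch.
   Writes the RTF text for one code point into out (capacity n).
   Returns the number of characters written, or -1 if the code point
   is not handled by this sketch. */
int rtf_escape(uint32_t cp, char *out, size_t n)
{
    if (cp == '\\' || cp == '{' || cp == '}')
        return snprintf(out, n, "\\%c", (char)cp);        /* RTF control characters */
    if (cp < 0x80)
        return snprintf(out, n, "%c", (char)cp);          /* plain ASCII */
    if (cp >= 0xA0 && cp <= 0xFF)
        return snprintf(out, n, "\\'%02x", (unsigned)cp); /* ANSI byte as \'xx */
    if (cp <= 0xFFFF) {
        /* \uN takes a signed 16-bit decimal; '?' is the ASCII fallback. */
        int v = (cp > 0x7FFF) ? (int)cp - 0x10000 : (int)cp;
        return snprintf(out, n, "\\u%d?", v);
    }
    return -1;   /* beyond the BMP: out of scope for this sketch */
}
```

With this sketch, è (U+00E8) becomes \'e8 and Δ (U+0394) becomes \u916?, matching the two examples in the text; the output is pure ASCII in both cases.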
http://kevincarmody.com/hmg/INCLUDE/i_richeditbox.ch - lines 196-200

Code: Select all

/* 
  Following 5 #defines modified by Kevin Carmody, October 2015

  These constant names have been modified.
    RICHEDITFILE_TEXT has been renamed to RICHEDITFILE_TEXTANSI for clarity.
    RICHEDITFILE_TEXTUTF16BE has been added because RtfLoadFile now supports 
      UTF-16 BE text files.  UTF-16 BE files are supported by Notepad and
      many other word processing applications.
    RICHEDITFILE_TEXTUTF16 has been renamed to RICHEDITFILE_TEXTUTF16LE to 
      clearly distinguish it from RICHEDITFILE_TEXTUTF16BE.
    RICHEDITFILE_RTFUTF8 is unnecessary and has been removed.  Although 
      the Windows EM_STREAMIN message supports it, in practice this file 
      type never occurs, and it is not supported by any standard word 
      processing application that I have ever seen.  In RTF files, non-ASCII 
      characters are always written as ASCII escape sequences, e.g. \'e8 for 
      è (CHR(0xE8)) and \u916? for Greek uppercase delta (U+0394, decimal 
      916).  So UTF-8 on top of RTF is never needed.

  These values are returned by GetRichEditFileType() and are used by the 
    LoadFile, RtfLoadFile, SaveFile, and RtfSaveFile rich edit box methods.

  See
    GetRichEditFileType() in SOURCE\h_richeditbox.prg
    _RichEditBox_DoMethod() in SOURCE\h_controlmisc.prg
    RichEditBox_LoadFile() and RichEditBox_SaveFile() in SOURCE\h_richeditbox.prg
    RichEditBox_StreamIn() and RichEditBox_StreamOut() in SOURCE\c_richeditbox.c
    HMG_UTF16ByteSwap() in SOURCE\h_UNICODE_String.prg
*/

*****************
*   File type   *
*****************

#define RICHEDITFILE_TEXTANSI      1   // ANSI text file
#define RICHEDITFILE_TEXTUTF8      2   // UTF-8 text file
#define RICHEDITFILE_TEXTUTF16LE   3   // UTF-16 LE (little endian) text file
#define RICHEDITFILE_RTF           4   // RTF file
#define RICHEDITFILE_TEXTUTF16BE   5   // UTF-16 BE (big endian) text file
The new function GetRichEditFileType() returns one of these constants, which can be passed as the file type parameter. See GetRichEditFileType() below.

RtfLoadFile calls RICHEDITBOX_LOADFILE(), which has been enhanced.
  • Skips over byte order marks in Unicode text files.
  • Supports UTF-16 BE file type. This function supports UTF-16 BE text files by using HMG_UTF16ByteSwap() to first convert the UTF-16 BE file to UTF-16 LE and then calling RichEditBox_StreamIn() on the converted UTF-16 LE file.
http://kevincarmody.com/hmg/SOURCE/h_richeditbox.prg - lines 436-467

Code: Select all

/*
    Following function modified by Kevin Carmody, October 2015

    Changes
      Skips over byte order marks in Unicode text files.
      Supports UTF-16 BE text files (big endian Unicode text file).

    This function supports UTF-16 BE text files by using HMG_UTF16ByteSwap() 
    to first convert the UTF-16 BE file to UTF-16 LE and then calling 
    RichEditBox_StreamIn() on the UTF-16 LE file.

    See
      RICHEDITFILE_* constants in SOURCE\i_richeditbox.ch
      RTFLoadFile and LoadFile translations in SOURCE\i_window.ch
      _RichEditBox_DoMethod() in SOURCE\h_controlmisc.prg
      RichEditBox_StreamIn() in SOURCE\c_richeditbox.c
      HMG_UTF16ByteSwap() in SOURCE\h_UNICODE_String.prg
*/ 

*-----------------------------------------------------------------------------*
FUNCTION RichEditBox_LoadFile( hWndControl, cFile, lSelection, nType )
*-----------------------------------------------------------------------------*
LOCAL lSuccess := .F.
LOCAL cTempFile

   IF ValType( lSelection ) <> "L"
      lSelection := .F.
   ENDIF
   
   IF ValType( nType ) <> "N"
      nType := RICHEDITFILE_RTF
   ENDIF

   // Try cFile as an RTF resource name first (one call only); if that fails, treat it as a disk file.
   IF RichEditBox_RTFLoadResourceFile( hWndControl, cFile, lSelection )
      lSuccess := .T.
   ELSE
      IF nType == RICHEDITFILE_TEXTUTF16BE
         cTempFile := GETTEMPFOLDER() + "_RichEditLoadFile.txt"
         lSuccess  := HMG_UTF16ByteSwap( cFile, cTempFile )
         IF lSuccess
            lSuccess := RichEditBox_StreamIn( hWndControl, cTempFile, lSelection, RICHEDITFILE_TEXTUTF16LE )
         ENDIF
         DELETE FILE ( cTempFile )
      ELSE
         lSuccess := RichEditBox_StreamIn( hWndControl, cFile, lSelection, nType )
      ENDIF
   ENDIF

Return lSuccess
RichEditBox_LoadFile() calls HMG_UTF16BYTESWAP(), a new function.

Converts between UTF-16 LE and UTF-16 BE files.

http://kevincarmody.com/hmg/SOURCE/h_UNICODE_String.prg - lines 454-498

Code: Select all

/*
    Following function added by Kevin Carmody, October 2015

    Converts between UTF-16 LE and UTF-16 BE files.

    See 
      RichEditBox_LoadFile() and RichEditBox_SaveFile() in SOURCE\h_richeditbox.prg
      RICHEDITFILE_* constants in SOURCE\i_richeditbox.ch
      RichEditBox_StreamIn() and RichEditBox_StreamOut() in SOURCE\c_richeditbox.c
      _RichEditBox_DoMethod() in SOURCE\h_controlmisc.prg
      RTFLoadFile, LoadFile, RTFSaveFile, SaveFile translations in SOURCE\i_window.ch
*/

*-----------------------------------------------------------------------------*
FUNCTION HMG_UTF16ByteSwap( cInFile, cOutFile )
*-----------------------------------------------------------------------------*

LOCAL hInFile   := FOPEN( cInFile , FO_READ )
LOCAL hOutFile  := FCREATE( cOutFile )
LOCAL cInBuffer := SPACE( 0x400 )
LOCAL nBufRead  := 1
LOCAL lSuccess  := .N.
LOCAL cOutBuffer, cBytePair, nBufWrite, nByte

   BEGIN SEQUENCE

      IF hInFile < 0
         BREAK
      ENDIF
      IF hOutFile < 0
         BREAK
      ENDIF

      WHILE nBufRead > 0

         cOutBuffer := ""
         nBufRead   := FREAD( hInFile, @cInBuffer, 0x400 )
         IF nBufRead > 0
            FOR nByte := 1 TO nBufRead STEP 2
               cBytePair  := SUBSTR( cInBuffer, nByte, 2 )
               cOutBuffer += RIGHT( cBytePair, 1 ) + LEFT( cBytePair, 1 )
            NEXT
            nBufWrite := FWRITE( hOutFile, cOutBuffer )
            IF nBufWrite < nBufRead
               BREAK
            ENDIF
         ENDIF

      ENDDO

      lSuccess := .Y.
    
   END SEQUENCE

   FCLOSE( hInFile )
   FCLOSE( hOutFile )

RETURN lSuccess
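
The core of the conversion is just swapping the two bytes of every 16-bit code unit, which works in both directions (LE to BE and BE to LE). A minimal C sketch of that step on an in-memory buffer follows; it is an illustration only, and the name utf16_byte_swap is hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper, not part of the patch.
   Swaps each pair of bytes in buf in place. A trailing odd byte,
   which cannot be half of a UTF-16 code unit, is left untouched.
   Returns the number of bytes that took part in a swap. */
size_t utf16_byte_swap(unsigned char *buf, size_t len)
{
    size_t i;
    for (i = 0; i + 1 < len; i += 2) {
        unsigned char tmp = buf[i];
        buf[i]     = buf[i + 1];
        buf[i + 1] = tmp;
    }
    return i;
}
```

Note that the byte order mark is swapped along with everything else: FE FF (UTF-16 BE) becomes FF FE (UTF-16 LE), so the converted file carries the right BOM without special handling.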
RichEditBox_LoadFile() calls RICHEDITBOX_STREAMIN(), which has been enhanced.

Now skips over byte order mark in Unicode text files.

http://kevincarmody.com/hmg/SOURCE/c_richeditbox.c - lines 198-263 (modified 207-209, 216-218, 233-250)

Code: Select all

/*
    Following function modified by Kevin Carmody, October 2015

    Now skips over byte order marks in Unicode text files.

    This function does not directly support UTF-16 BE text files.
    RichEditBox_LoadFile() supports them by using HMG_UTF16ByteSwap() to
    first convert the UTF-16 BE file to UTF-16 LE and then calling this 
    function on the UTF-16 LE file.

    See
      RichEditBox_LoadFile() in SOURCE\h_richeditbox.prg
      HMG_UTF16ByteSwap() in SOURCE\h_UNICODE_String.prg
      _RichEditBox_DoMethod() in SOURCE\h_controlmisc.prg
      RTFLoadFile and LoadFile translations in SOURCE\i_window.ch
      RICHEDITFILE_* constants in SOURCE\i_richeditbox.ch
*/
//        RichEditBox_StreamIn ( hWndControl, cFileName, lSelection, nDataFormat )
HB_FUNC ( RICHEDITBOX_STREAMIN )
{
   HWND       hWndControl = (HWND)   HMG_parnl (1);
   TCHAR     *cFileName   = (TCHAR*) HMG_parc (2);
   BOOL       lSelection  = (BOOL)   hb_parl  (3);
   LONG       nDataFormat = (LONG)   hb_parnl (4);
   HANDLE     hFile;
   // Following 3 lines added by Kevin Carmody, October 2015
   BYTE       bUtf8Bom[3]; 
   BYTE       bUtf16Bom[2]; 
   DWORD      dwRead;
   EDITSTREAM es;
   LONG       Format;

   switch( nDataFormat )
   {
   // Comments in this switch block modified by Kevin Carmody, October 2015
      case 1:   Format = SF_TEXT; break; // ANSI or UTF-8 with BOM or mixed (UTF-8 BOM is removed, overlong UTF-8 is accepted, invalid UTF-8 is read as ANSI)
      case 2:   Format = ( CP_UTF8 << 16 ) | SF_USECODEPAGE | SF_TEXT; break; // UTF-8 without BOM (BOM is not removed)
      case 3:   Format = SF_TEXT | SF_UNICODE; break; // UTF-16 LE without BOM (BOM is not removed)
      case 4:   Format = SF_RTF;  break;
      // case 5, UTF-8 RTF, removed by Kevin Carmody, October 2015, because it never occurs
      default:  Format = SF_RTF; break;
   }

   if ( lSelection )
        Format = Format | SFF_SELECTION;

   if( ( hFile = CreateFile (cFileName, GENERIC_READ, FILE_SHARE_READ, NULL, OPEN_EXISTING, FILE_FLAG_SEQUENTIAL_SCAN, NULL )) == INVALID_HANDLE_VALUE )
   {   hb_retl (FALSE);
       return;
   }

   // Following switch block added by Kevin Carmody, October 2015
   switch( nDataFormat )
   {
      case 2:   // UTF-8: skip the 3-byte BOM (EF BB BF) if present
         if ( ! ReadFile (hFile, bUtf8Bom, 3, &dwRead, NULL) )
         {   CloseHandle (hFile);   // do not leak the handle on a read error
             hb_retl (FALSE);
             return;
         }
         if ( ! ( dwRead == 3 && bUtf8Bom[0] == 0xEF && bUtf8Bom[1] == 0xBB && bUtf8Bom[2] == 0xBF ) )
            SetFilePointer (hFile, 0, NULL, FILE_BEGIN);   // no BOM: rewind to the start
         break;
      case 3:   // UTF-16 LE: skip the 2-byte BOM (FF FE) if present
         if ( ! ReadFile (hFile, bUtf16Bom, 2, &dwRead, NULL) )
         {   CloseHandle (hFile);
             hb_retl (FALSE);
             return;
         }
         if ( ! ( dwRead == 2 && bUtf16Bom[0] == 0xFF && bUtf16Bom[1] == 0xFE ) )
            SetFilePointer (hFile, 0, NULL, FILE_BEGIN);   // no BOM: rewind to the start
         break;
      default:  break;   // ANSI (1) and RTF (4) need no BOM handling here
   }
   es.pfnCallback = EditStreamCallbackRead;
   es.dwCookie    = (DWORD_PTR) hFile;
   es.dwError     = 0;

   SendMessage ( hWndControl, EM_STREAMIN, (WPARAM) Format, (LPARAM) &es );

   CloseHandle (hFile);

   if( es.dwError )
      hb_retl (FALSE);
   else
      hb_retl (TRUE);
}
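
The read-then-rewind idea used for the byte order mark can also be shown with portable C stdio, which makes it easy to try outside Windows. This is a sketch of the technique only, not the patch code; the name skip_bom and its signature are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of the patch.
   Leaves fp positioned just past a leading BOM if the file starts with
   the given BOM, otherwise at offset 0.
   Returns the number of BOM bytes skipped (0 if none), or -1 on error. */
long skip_bom(FILE *fp, const unsigned char *bom, size_t bom_len)
{
    unsigned char head[4];
    size_t got;

    if (bom_len > sizeof head || fseek(fp, 0L, SEEK_SET) != 0)
        return -1;
    got = fread(head, 1, bom_len, fp);
    if (got == bom_len && memcmp(head, bom, bom_len) == 0)
        return (long)bom_len;            /* BOM present: stay positioned after it */
    if (fseek(fp, 0L, SEEK_SET) != 0)    /* no BOM: rewind so no data is lost */
        return -1;
    return 0;
}
```

A caller would pass the bytes EF BB BF for UTF-8 or FF FE for UTF-16 LE and then hand the stream to whatever reads the text, which now never sees the BOM.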
RtfLoadFile calls DOMETHOD(), which has been enhanced.

Added xData variable to enable it to return a value from _RichEditBox_DoMethod(). xData is used to return a value from a Rich Edit method. This change parallels the xData variable in GetProperty() that is currently used to return a value from a GridEx, Tree, or Rich Edit property.

http://kevincarmody.com/hmg/SOURCE/h_controlmisc.prg - lines 8883, 8900-8902

Code: Select all

Function DoMethod ( Arg1 , Arg2 , Arg3 , Arg4 , Arg5 , Arg6 , Arg7 , Arg8 , Arg9 )
/*  
    Following line modified by Kevin Carmody, October 2015

    Added xData variable to enable it to return a value from 
      _RichEditBox_DoMethod(). xData is used to return a value from a Rich 
      Edit method. This change parallels the xData variable in GetProperty() 
      that is currently used to return a value from a GridEx, Tree, or Rich 
      Edit property.

    See
      GetProperty() above
      _GridEx_GetProperty(), _Tree_GetProperty(), _RichEditBox_GetProperty(), _RichEditBox_DoMethod() below
      RTFLoadFile, RTFSaveFile, LoadFile, SaveFile translations in SOURCE\i_window.ch
      RichEditBox_LoadFile() and RichEditBox_SaveFile() in SOURCE\h_richeditbox.prg
*/ 
Local xData, i, hWnd
Local cMacro, cControlType

IF _GridEx_DoMethod      ( Arg1 , Arg2 , Arg3 , Arg4 , Arg5 , Arg6 , Arg7 , Arg8 , Arg9 ) == .T.
   Return Nil
ENDIF

IF _Tree_DoMethod        ( Arg1 , Arg2 , Arg3 , Arg4 , Arg5 , Arg6 , Arg7 , Arg8 , Arg9 ) == .T.
   Return Nil
ENDIF

/*
    Following 2 lines modified by Kevin Carmody, October 2015

    These lines use the xData variable defined above to return a value from
    a Rich Edit method.
*/ 
IF _RichEditBox_DoMethod ( @xData, Arg1 , Arg2 , Arg3 , Arg4 , Arg5 , Arg6 , Arg7 , Arg8 , Arg9 ) == .T.
   Return xData
ENDIF
DoMethod() calls _RICHEDITBOX_DOMETHOD(), which has been enhanced.

Added xData argument to enable it to return a value from the LoadFile, RtfLoadFile, SaveFile, and RtfSaveFile methods. This change parallels the xData argument that is currently used in _GridEx_GetProperty(), _Tree_GetProperty(), and _RichEditBox_GetProperty().

http://kevincarmody.com/hmg/SOURCE/h_controlmisc.prg - line 10464

Code: Select all

/*
    Following line modified by Kevin Carmody, October 2015

    Added xData argument to enable it to return a value to
      DoMethod(). xData is used to return a value from a Rich 
      Edit method. This change parallels the xData variable in 
      _RichEditBox_GetProperty().

    See
      GetProperty(), _GridEx_GetProperty(), _Tree_GetProperty(), 
        _RichEditBox_GetProperty() above
      RTFLoadFile, RTFSaveFile, LoadFile, SaveFile translations in SOURCE\i_window.ch
      RichEditBox_LoadFile() and RichEditBox_SaveFile() in SOURCE\h_richeditbox.prg
*/ 
Function _RichEditBox_DoMethod ( xData, Arg1 , Arg2 , Arg3 , Arg4 , Arg5 , Arg6 , Arg7 , Arg8 , Arg9 )
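For readers less familiar with the by-reference convention used here, the xData pattern (the handler returns a "handled" flag, and the method's own result comes back through a by-reference argument) can be sketched in C. This is only an illustration of the pattern, not HMG code: rich_edit_do_method(), do_method() and the stub result are hypothetical names and values.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical handler, the analogue of _RichEditBox_DoMethod().
   Returns true when it recognises the method name and stores the
   method's result in *result (the analogue of @xData). */
static bool rich_edit_do_method(const char *method, bool *result)
{
    assert(method != NULL && result != NULL);
    if (strcmp(method, "LOADFILE") == 0 || strcmp(method, "RTFLOADFILE") == 0) {
        *result = true;  /* stand-in for the value of RichEditBox_LoadFile() */
        return true;     /* handled */
    }
    return false;        /* not handled; the caller tries the next handler */
}

/* Hypothetical dispatcher, the analogue of DoMethod(). It hands back the
   value produced by the handler, or false when no handler took the call. */
static bool do_method(const char *method)
{
    bool result = false;
    if (rich_edit_do_method(method, &result))
        return result;
    return false;
}
```

The point of the split is that the "handled" flag and the method's return value travel separately, so a method that legitimately returns .F. is still recognised as handled.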
http://kevincarmody.com/hmg/SOURCE/h_controlmisc.prg - lines 10492-10494

Code: Select all

   /*
      Following two cases modified by Kevin Carmody, October 2015

      These cases use the xData argument defined above to return a value from 
      the Rich Edit methods RTFLoadFile, LoadFile, RTFSaveFile, and SaveFile.
   */ 
   CASE Arg3 == HMG_UPPER ("RTFLoadFile") .OR. Arg3 == HMG_UPPER ("LoadFile")
        xData := RichEditBox_LoadFile ( hWndControl, Arg4, Arg5, Arg6 )   // by default load in SF_RTF format
        RetVal := .T.

   CASE Arg3 == HMG_UPPER ("RTFSaveFile") .OR. Arg3 == HMG_UPPER ("SaveFile")
        xData := RichEditBox_SaveFile ( hWndControl, Arg4, Arg5, Arg6 )   // by default save in SF_RTF format
        RetVal := .T.
Enhanced rich edit control method RTFSAVEFILE()
  • Writes byte order marks to Unicode text files.
  • Supports UTF-16 BE text files (big endian Unicode text file).
  • The RTFUTF8 file type has been removed and the UTF-16 BE file type has been added. See below.
http://kevincarmody.com/hmg/INCLUDE/i_window.ch - lines 195-196, 203

Code: Select all

;; /*
      Following 2 lines modified by Kevin Carmody, October 2015

      They add the LoadFile and SaveFile methods to the rich edit box control
      as synonyms for the RTFLoadFile and RTFSaveFile methods.

      See 
        _RichEditBox_DoMethod() in SOURCE\h_controlmisc.prg
        RichEditBox_LoadFile() and RichEditBox_SaveFile() in SOURCE\h_richeditbox.prg
        RichEditBox_StreamIn() and RichEditBox_StreamOut() in SOURCE\c_richeditbox.c
        HMG_UTF16ByteSwap() in SOURCE\h_UNICODE_String.prg
   */ ;;
#xtranslate <w>. \<c\> . \<p:RTFLoadFile,RTFSaveFile,LoadFile,SaveFile\> (\<arg1\>,\<arg2\>,\<arg3\>) => DoMethod ( <"w">, \<"c"\> , \<"p"\> , \<arg1\> , \<arg2\>, \<arg3\> ) ;;
#xtranslate <w>. \<c\> . \<p:RTFLoadFile,RTFSaveFile,LoadFile,SaveFile\> (\<arg1\>,\<arg2\>) => DoMethod ( <"w">, \<"c"\> , \<"p"\> , \<arg1\> , \<arg2\> ) ;;
;; /*
      Following line added by Kevin Carmody, October 2015

      It allows the RTFLoadFile, RTFSaveFile, LoadFile, and SaveFile methods to be called with one argument,
      the file name.  The second argument, lSelection (RichEditBox_LoadFile() in h_richeditbox.prg), defaults to .F.
   */ ;;
#xtranslate <w>. \<c\> . \<p:RTFLoadFile,RTFSaveFile,LoadFile,SaveFile\> (\<arg1\>) => DoMethod ( <"w">, \<"c"\> , \<"p"\> , \<arg1\> ) ;;
http://kevincarmody.com/hmg/SOURCE/h_controlmisc.prg - lines 10496-10498

Code: Select all

   /*
      Following two cases modified by Kevin Carmody, October 2015

      These cases use the xData argument defined above to return a value from 
      the Rich Edit methods RTFLoadFile, LoadFile, RTFSaveFile, and SaveFile.
   */ 
   CASE Arg3 == HMG_UPPER ("RTFLoadFile") .OR. Arg3 == HMG_UPPER ("LoadFile")
        xData := RichEditBox_LoadFile ( hWndControl, Arg4, Arg5, Arg6 )   // by default load in SF_RTF format
        RetVal := .T.

   CASE Arg3 == HMG_UPPER ("RTFSaveFile") .OR. Arg3 == HMG_UPPER ("SaveFile")
        xData := RichEditBox_SaveFile ( hWndControl, Arg4, Arg5, Arg6 )   // by default save in SF_RTF format
        RetVal := .T.
The file types that RtfSaveFile accepts have been changed. See above under RtfLoadFile.

The new function GetRichEditFileType() returns a value that can be passed as the file type argument. See the description of GetRichEditFileType() below.

RtfSaveFile calls RICHEDITBOX_SAVEFILE(), which has been enhanced.
  • Writes byte order marks to Unicode text files.
  • Supports UTF-16 BE file type. This function supports UTF-16 BE text files by first calling RichEditBox_StreamOut() to generate a UTF-16 LE file and then calling HMG_UTF16ByteSwap() to convert the UTF-16 LE file to UTF-16 BE.
http://kevincarmody.com/hmg/SOURCE/h_richeditbox.prg - lines 490-515

Code: Select all

/*
    Following function modified by Kevin Carmody, October 2015

    Changes
      Writes byte order marks to Unicode text files.
      Supports UTF-16 BE text files (big endian Unicode text file).

    This function supports UTF-16 BE text files by first calling 
    RichEditBox_StreamOut() to generate a UTF-16 LE file and then calling 
    HMG_UTF16ByteSwap() to convert the UTF-16 LE file to UTF-16 BE.

    See
      RICHEDITFILE_* constants in SOURCE\i_richeditbox.ch
      RTFSaveFile and SaveFile translations in SOURCE\i_window.ch
      _RichEditBox_DoMethod() in SOURCE\h_controlmisc.prg
      RichEditBox_StreamOut() in SOURCE\c_richeditbox.c
      HMG_UTF16ByteSwap() in SOURCE\h_UNICODE_String.prg
*/ 

*-----------------------------------------------------------------------------*
FUNCTION RichEditBox_SaveFile( hWndControl, cFile, lSelection, nType )
*-----------------------------------------------------------------------------*
LOCAL lSuccess := .N.
LOCAL cTempFile

   IF ValType( lSelection ) <> "L"
      lSelection := .F.
   ENDIF
   
   IF ValType( nType ) <> "N"
      nType := RICHEDITFILE_RTF
   ENDIF

   IF nType == RICHEDITFILE_TEXTUTF16BE
      cTempFile := GETTEMPFOLDER() + "_RichEditLoadFile.txt"
      lSuccess := RichEditBox_StreamOut( hWndControl, cTempFile, lSelection, RICHEDITFILE_TEXTUTF16LE )
      IF lSuccess
         lSuccess  := HMG_UTF16ByteSwap( cTempFile, cFile )
      ENDIF
      DELETE FILE ( cTempFile )
   ELSE
      lSuccess := RichEditBox_StreamOut( hWndControl, cFile, lSelection, nType )
   ENDIF

RETURN lSuccess
RichEditBox_SaveFile() calls HMG_UTF16BYTESWAP(), a new function.

Converts between UTF-16 LE and UTF-16 BE files.

http://kevincarmody.com/hmg/SOURCE/h_UNICODE_String.prg - lines 454-498

Code: Select all

/*
    Following function added by Kevin Carmody, October 2015

    Converts between UTF-16 LE and UTF-16 BE files.

    See 
      RichEditBox_LoadFile() and RichEditBox_SaveFile() in SOURCE\h_richeditbox.prg
      RICHEDITFILE_* constants in SOURCE\i_richeditbox.ch
      RichEditBox_StreamIn() and RichEditBox_StreamOut() in SOURCE\c_richeditbox.c
      _RichEditBox_DoMethod() in SOURCE\h_controlmisc.prg
      RTFLoadFile, LoadFile, RTFSaveFile, SaveFile translations in SOURCE\i_window.ch
*/

*-----------------------------------------------------------------------------*
FUNCTION HMG_UTF16ByteSwap( cInFile, cOutFile )
*-----------------------------------------------------------------------------*

LOCAL hInFile   := FOPEN( cInFile , FO_READ )
LOCAL hOutFile  := FCREATE( cOutFile )
LOCAL cInBuffer := SPACE( 0x400 )
LOCAL nBufRead  := 1
LOCAL lSuccess  := .N.
LOCAL cOutBuffer, cBytePair, nBufWrite, nByte

   BEGIN SEQUENCE

      IF hInFile < 0
         BREAK
      ENDIF
      IF hOutFile < 0
         BREAK
      ENDIF

      WHILE nBufRead > 0

         cOutBuffer := ""
         nBufRead   := FREAD( hInFile, @cInBuffer, 0x400 )
         IF nBufRead > 0
            FOR nByte := 1 TO nBufRead STEP 2
               cBytePair  := SUBSTR( cInBuffer, nByte, 2 )
               cOutBuffer += RIGHT( cBytePair, 1 ) + LEFT( cBytePair, 1 )
            NEXT
            nBufWrite := FWRITE( hOutFile, cOutBuffer )
            IF nBufWrite < nBufRead
               BREAK
            ENDIF
         ENDIF

      ENDDO

      lSuccess := .Y.
    
   END SEQUENCE

   FCLOSE( hInFile )
   FCLOSE( hOutFile )

RETURN lSuccess
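The core of the conversion is a swap of each pair of bytes. A minimal C sketch of that step on an in-memory buffer follows. It is not a port of HMG_UTF16ByteSwap(): the name utf16_byte_swap() is hypothetical, file handling is left out, and leaving a trailing odd byte untouched is a choice made for this sketch only.

```c
#include <assert.h>
#include <stddef.h>

/* Swap each pair of bytes in place, which converts UTF-16 LE to
   UTF-16 BE and back. A trailing odd byte (a truncated code unit) is
   left untouched. Returns the number of bytes that were swapped. */
static size_t utf16_byte_swap(unsigned char *buf, size_t len)
{
    size_t i;
    assert(buf != NULL || len == 0);
    for (i = 0; i + 1 < len; i += 2) {
        unsigned char tmp = buf[i];
        buf[i] = buf[i + 1];
        buf[i + 1] = tmp;
    }
    return i;
}
```

Because the byte order mark is itself a 2-byte code unit, the same swap turns the LE mark (FF FE) into the BE mark (FE FF), so no separate BOM handling is needed.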
RichEditBox_SaveFile() calls RICHEDITBOX_STREAMOUT(), which has been enhanced.

Now writes byte order mark in Unicode text files.

http://kevincarmody.com/hmg/SOURCE/c_richeditbox.c - lines 294-349 (modified 303-305, 312-314, 329-336)

Code: Select all

/*
    Following function modified by Kevin Carmody, October 2015

    Now writes byte order mark in Unicode text files.

    This function does not directly support a UTF-16 BE text file.
    RichEditBox_SaveFile() supports it by first calling this function to
    generate a UTF-16 LE file and then calling HMG_UTF16ByteSwap() to
    convert the UTF-16 LE file to UTF-16 BE.

    See
      RichEditBox_SaveFile() in SOURCE\h_richeditbox.prg
      HMG_UTF16ByteSwap() in SOURCE\h_UNICODE_String.prg
      _RichEditBox_DoMethod() in SOURCE\h_controlmisc.prg
      RTFSaveFile and SaveFile translations in SOURCE\i_window.ch
      RICHEDITFILE_* constants in SOURCE\i_richeditbox.ch
*/
//        RichEditBox_StreamOut ( hWndControl, cFileName, lSelection, nDataFormat )
HB_FUNC ( RICHEDITBOX_STREAMOUT )
{
   HWND       hWndControl = (HWND)   HMG_parnl (1);
   TCHAR     *cFileName   = (TCHAR*) HMG_parc (2);
   BOOL       lSelection  = (BOOL)   hb_parl  (3);
   LONG       nDataFormat = (LONG)   hb_parnl (4);
   HANDLE     hFile;
   // Following 3 lines added by Kevin Carmody, October 2015
   BYTE       bUtf8Bom[3]  = {0xEF, 0xBB, 0xBF};
   BYTE       bUtf16Bom[2] = {0xFF, 0xFE}; 
   DWORD      dwWritten;
   EDITSTREAM es;
   LONG       Format;

   switch( nDataFormat )
   {
   // Comments in this switch block modified by Kevin Carmody, October 2015
      case 1:   Format = SF_TEXT; break; // ANSI (non-ANSI characters are converted to question marks)
      case 2:   Format = ( CP_UTF8 << 16 ) | SF_USECODEPAGE | SF_TEXT; break; // UTF-8 without BOM
      case 3:   Format = SF_TEXT | SF_UNICODE; break; // UTF-16 LE without BOM
      case 4:   Format = SF_RTF;  break;
      // case 5, UTF-8 RTF, removed by Kevin Carmody, October 2015, because it never occurs
      default:  Format = SF_RTF; break;
   }

   if ( lSelection )
        Format = Format | SFF_SELECTION;

   if( ( hFile = CreateFile (cFileName, GENERIC_WRITE, FILE_SHARE_WRITE, NULL, CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL )) == INVALID_HANDLE_VALUE )
   {   hb_retl (FALSE);
       return;
   }

   // Following switch block added by Kevin Carmody, October 2015
   switch( nDataFormat )
   {
      case 1:   break;
      case 2:   WriteFile( hFile, bUtf8Bom, 3, &dwWritten, NULL ); break; // write UTF-8 BOM at head of file
      case 3:   WriteFile( hFile, bUtf16Bom, 2, &dwWritten, NULL ); break; // write UTF-16 LE BOM at head of file
      case 4:   break;
      default:  break;
   }
   es.pfnCallback = EditStreamCallbackWrite;
   es.dwCookie    = (DWORD_PTR) hFile;
   es.dwError     = 0;

   SendMessage ( hWndControl, EM_STREAMOUT, (WPARAM) Format, (LPARAM) &es );
   
   CloseHandle (hFile);

   if( es.dwError )
      hb_retl (FALSE);
   else
      hb_retl (TRUE);
}
RtfSaveFile calls DOMETHOD(), which has been enhanced. See RtfLoadFile above.
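The two switch blocks in RichEditBox_StreamOut() above can be read as one table: for each file type number, which stream format flags are sent with EM_STREAMOUT and which byte order mark, if any, is written first. The C sketch below restates that table so it builds without the Windows SDK. The X_ constants are local copies of the values usually found in richedit.h and winnls.h and should be treated as assumptions, and stream_format() and bom_for_type() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local copies of the Windows constants (assumed values). */
#define X_SF_TEXT        0x0001u
#define X_SF_RTF         0x0002u
#define X_SF_UNICODE     0x0010u
#define X_SF_USECODEPAGE 0x0020u
#define X_SFF_SELECTION  0x8000u
#define X_CP_UTF8        65001u

/* Stream format for file types 1 (ANSI), 2 (UTF-8), 3 (UTF-16 LE) and
   4 (RTF); any other value falls back to RTF, as in the original. */
static uint32_t stream_format(int type, int selection_only)
{
    uint32_t fmt;
    switch (type) {
    case 1:  fmt = X_SF_TEXT; break;
    case 2:  fmt = (X_CP_UTF8 << 16) | X_SF_USECODEPAGE | X_SF_TEXT; break;
    case 3:  fmt = X_SF_TEXT | X_SF_UNICODE; break;
    default: fmt = X_SF_RTF; break;
    }
    if (selection_only)
        fmt |= X_SFF_SELECTION;
    return fmt;
}

/* Copy the byte order mark for the file type into out (room for at
   least 3 bytes) and return its length; 0 means no mark is written. */
static size_t bom_for_type(int type, unsigned char *out)
{
    assert(out != NULL);
    switch (type) {
    case 2:  out[0] = 0xEF; out[1] = 0xBB; out[2] = 0xBF; return 3;
    case 3:  out[0] = 0xFF; out[1] = 0xFE; return 2;
    default: return 0;
    }
}
```

The UTF-8 case packs the code page number into the high 16 bits of the format value, which is why the sketch uses an unsigned 32-bit type for it.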
Other new function GETRICHEDITFILETYPE()
  • Returns the file type of an RTF file or text file, which can be used as the file type parameter in the LoadFile, RtfLoadFile, SaveFile, and RtfSaveFile methods. This function examines the first few bytes of the file to see if there is an RTF header or Unicode byte order mark.
  • When there is no RTF header or byte order mark: If the optional lUtf8Test argument is .T., then the whole file is scanned to see if it is in UTF-8 format. If so, then the file type is returned as UTF-8. Otherwise the file type is returned as ANSI.
http://kevincarmody.com/hmg/SOURCE/h_richeditbox.prg - lines 587-646

Code: Select all

/*
    Following function added by Kevin Carmody, October 2015

    This function returns the file type of an RTF file or text file, which 
    can be used as the file type parameter in the LoadFile, RtfLoadFile, 
    SaveFile, and RtfSaveFile methods.  This function examines the first few 
    bytes of the file to see if there is an RTF header or Unicode byte order 
    mark.
      
    When there is no RTF header or byte order mark:  If the optional 
    lUtf8Test argument is .T., then the whole file is scanned to see if it is 
    in UTF-8 format.  If so, then the file type is returned as UTF-8.  
    Otherwise the file type is returned as ANSI.

    See
      RICHEDITFILE_* constants in SOURCE\i_richeditbox.ch
      RTFLoadFile, LoadFile, RTFSaveFile, SaveFile translations in SOURCE\i_window.ch
      _RichEditBox_DoMethod() in SOURCE\h_controlmisc.prg
      RichEditBox_StreamIn() and RichEditBox_StreamOut() in SOURCE\c_richeditbox.c
      HMG_UTF16ByteSwap() and HMG_IsUtf8() in SOURCE\h_UNICODE_String.prg

*/ 

*-----------------------------------------------------------------------------*
FUNCTION GetRichEditFileType ( cFile, lUtf8Test )
*-----------------------------------------------------------------------------*

LOCAL hFile    := FOPEN( cFile, FO_READ )
LOCAL cBuffer  := SPACE( 5 )
LOCAL nBufRead := 0
LOCAL nType    := 0

/*
  The following code block tests whether an unmarked text file contains
  valid UTF-8 text with non-ASCII characters.
*/

LOCAL bIsUtf8NonAscii := {||
   LOCAL lUtf8NonAscii := .N.
   LOCAL cPartial := ''
      cBuffer  := SPACE( 0x400 )
      nBufRead := 1
      BEGIN SEQUENCE
         WHILE nBufRead > 0
            nBufRead := FREAD( hFile, @cBuffer, 0x400 )
            IF nBufRead > 0 .AND. HMG_IsUtf8( cPartial + cBuffer, .N., .Y., @cPartial )
               lUtf8NonAscii := .Y.
               BREAK
            ENDIF
         ENDDO
         IF ! EMPTY( cPartial )
            lUtf8NonAscii := .N.
         ENDIF
      END SEQUENCE
   RETURN lUtf8NonAscii
   }

   BEGIN SEQUENCE

      IF hFile < 0
         BREAK
      ENDIF
      nBufRead := FREAD( hFile, @cBuffer, 5 )
      DO CASE
      CASE nBufRead >= 5 .AND. LEFT( cBuffer, 5 ) == "{\rtf"
         nType := RICHEDITFILE_RTF
      CASE nBufRead >= 3 .AND. LEFT( cBuffer, 3 ) == E"\xEF\xBB\xBF"
         nType := RICHEDITFILE_TEXTUTF8
      CASE nBufRead >= 2 .AND. LEFT( cBuffer, 2 ) == E"\xFF\xFE"
         nType := RICHEDITFILE_TEXTUTF16LE
      CASE nBufRead >= 2 .AND. LEFT( cBuffer, 2 ) == E"\xFE\xFF"
         nType := RICHEDITFILE_TEXTUTF16BE
      CASE ! EMPTY( lUtf8Test ) .AND. bIsUtf8NonAscii:EVAL( )
         nType := RICHEDITFILE_TEXTUTF8
      OTHERWISE
         nType := RICHEDITFILE_TEXTANSI
      ENDCASE

   END SEQUENCE

   FCLOSE( hFile )

RETURN nType   
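The header test in the DO CASE block above can be sketched in C as a pure function over the first bytes of a file. This is an illustration under stated assumptions: detect_file_type() and the FT_ enum are hypothetical names, the enum values are not the RICHEDITFILE_* values, and the optional whole-file UTF-8 scan is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical type codes for this sketch only. */
enum file_type { FT_ANSI, FT_UTF8, FT_UTF16LE, FT_UTF16BE, FT_RTF };

/* Classify a file from its first n bytes (5 are enough). The order of
   the tests matches the DO CASE block: RTF header first, then the
   UTF-8, UTF-16 LE and UTF-16 BE byte order marks, then ANSI. */
static enum file_type detect_file_type(const unsigned char *head, size_t n)
{
    assert(head != NULL || n == 0);
    if (n >= 5 && memcmp(head, "{\\rtf", 5) == 0)
        return FT_RTF;
    if (n >= 3 && memcmp(head, "\xEF\xBB\xBF", 3) == 0)
        return FT_UTF8;
    if (n >= 2 && memcmp(head, "\xFF\xFE", 2) == 0)
        return FT_UTF16LE;
    if (n >= 2 && memcmp(head, "\xFE\xFF", 2) == 0)
        return FT_UTF16BE;
    return FT_ANSI;  /* no header and no mark */
}
```

A caller would read up to 5 bytes from the start of the file and pass the number of bytes actually read as n, so that short files are handled without reading past the buffer.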
Other enhanced function HMG_ISUTF8()
  • When cString is the empty string or is all ASCII: If the optional lAllowASCII argument is .T., then the return value is .T. Otherwise the return value is .F.
  • When cString is valid UTF-8 except that it ends with an incomplete UTF-8 byte sequence: If the optional lAllowPartial argument is .T., then the return value is .T. and the incomplete byte sequence is passed back through the cPartial argument. Otherwise the return value is .F. This is useful when cString is a file buffer.
  • The return value is .F. if cString encodes any code point greater than the Unicode limit of 0x10FFFF, or if it encodes any surrogate character, or if it contains an overlong UTF-8 byte sequence. One overlong sequence is accepted, the 2-byte overlong sequence for the null character (0xC0 0x80), which is commonly accepted by UTF-8 parsers.
http://kevincarmody.com/hmg/SOURCE/h_UNICODE_String.prg - lines 209-315

Code: Select all

/*
    Following function modified by Kevin Carmody, October 2015

    Changes
      When cString is the empty string or is all ASCII:  If the optional 
        lAllowASCII argument is .T., then the return value is .T.  Otherwise 
        the return value is .F.
      When cString is valid UTF-8 except that it ends with an incomplete 
        UTF-8 byte sequence:  If the optional lAllowPartial argument is .T., 
        then the return value is .T. and the incomplete byte sequence is 
        passed back through the cPartial argument.  Otherwise the return 
        value is .F.  This is useful when cString is a file buffer.
      The return value is .F. if cString encodes any code point greater than 
        the Unicode limit of 0x10FFFF, or if it encodes any surrogate 
        character, or if it contains an overlong UTF-8 byte sequence.  One 
        overlong sequence is accepted, the 2-byte overlong sequence for the 
        null character (0xC0 0x80), which is commonly accepted by UTF-8 
        parsers.

    See
      GetRichEditFileType() in SOURCE\h_richeditbox.prg
      HB_STRISUTF8 in \src\rtl\strutf8.c in Harbour source 
      is_utf8() at http://stackoverflow.com/questions/1031645/how-to-detect-utf-8-in-plain-c
*/ 

FUNCTION HMG_IsUTF8( cString, lAllowASCII, lAllowPartial, cPartial )

LOCAL lASCII  := .T.
LOCAL lCheck  := .F.
LOCAL lUTF8   := .T.
LOCAL nCBytes := 0
LOCAL nRBytes := 0
LOCAL cChar, nChar, nLead

   IF lAllowASCII == NIL
      lAllowASCII := .F.
   ENDIF
   IF lAllowPartial == NIL
      lAllowPartial := .F.
   ENDIF

   BEGIN SEQUENCE

      FOR EACH cChar IN cString

         nChar := HB_BCODE( cChar )

         IF nCBytes > 0 // check continuation bytes

            IF nChar < 0x80 .OR. nChar > 0xBF // disallow invalid continuation byte
               BREAK
            ENDIF
            IF lCheck // check first continuation byte for partially valid lead byte
               SWITCH nLead
               CASE 0xC0 // disallow 2-byte overlongs except overlong null character
                  IF nChar != 0x80
                     BREAK
                  ENDIF
                  EXIT
               CASE 0xE0 // disallow 3-byte overlongs
                  IF nChar < 0xA0
                     BREAK
                  ENDIF
                  EXIT
               CASE 0xED // disallow surrogates
                  IF nChar > 0x9F
                     BREAK
                  ENDIF
                  EXIT
               CASE 0xF0 // disallow 4-byte overlongs
                  IF nChar < 0x90
                     BREAK
                  ENDIF
                  EXIT
               CASE 0xF4 // disallow 4-byte sequences beyond end of Unicode
                  IF nChar > 0x8F
                     BREAK
                  ENDIF
                  EXIT
               ENDSWITCH
               lCheck := .F.
            ENDIF
            nCBytes --
            nRBytes ++

         ELSEIF nChar >= 0x80 // check lead byte

            lASCII := .F.
            nLead := nChar
            IF nLead < 0xC0 .OR. nLead == 0xC1 .OR. nLead > 0xF4 // disallow invalid lead bytes
               BREAK
            ENDIF
            lCheck := ( nLead == 0xC0 .OR. nLead == 0xE0 .OR. nLead == 0xED .OR. ;
              nLead == 0xF0 .OR. nLead == 0xF4 ) // partially valid lead bytes

            DO CASE // compute number of continuation bytes
            CASE nLead <= 0xDF
              nCBytes := 1
            CASE nLead <= 0xEF
              nCBytes := 2
            OTHERWISE
              nCBytes := 3
            ENDCASE
            nRBytes := 1

         ENDIF

      NEXT

   RECOVER

      lUTF8 := .F.

   END SEQUENCE

   IF lUTF8 .AND. nCBytes > 0
      IF lAllowPartial
         cPartial := RIGHT( cString, nRBytes )
      ELSE
         lUTF8 := .F.
      ENDIF
   ELSE
      IF lAllowPartial
         cPartial := ''
      ENDIF
   ENDIF

   IF ! lAllowASCII .AND. lASCII
      lUTF8 := .F.
   ENDIF

RETURN lUTF8
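The validation rules listed above (lead byte ranges, overlong sequences, surrogates, the 0x10FFFF limit, and an incomplete trailing sequence) can also be sketched in C. This is not a port of HMG_IsUTF8(): utf8_check() is a hypothetical name, the lAllowASCII switch is left out, and instead of an lAllowPartial flag the function always reports the length of an incomplete tail and leaves the decision to the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true when s[0..n) breaks none of the rules above.
   *partial receives the length of an incomplete trailing sequence,
   or 0 when the input ends on a sequence boundary. */
static bool utf8_check(const unsigned char *s, size_t n, size_t *partial)
{
    size_t need = 0;                     /* continuation bytes still expected */
    size_t seen = 0;                     /* bytes read of the current sequence */
    unsigned char lo = 0x80, hi = 0xBF;  /* range allowed for the next continuation byte */
    size_t i;

    assert(partial != NULL);
    *partial = 0;
    for (i = 0; i < n; i++) {
        unsigned char c = s[i];
        if (need > 0) {
            if (c < lo || c > hi)
                return false;            /* invalid continuation byte */
            lo = 0x80; hi = 0xBF;        /* narrowed range applies to the first one only */
            need--;
            seen++;
        } else if (c >= 0x80) {
            if (c < 0xC0 || c == 0xC1 || c > 0xF4)
                return false;            /* invalid lead byte */
            need = (c <= 0xDF) ? 1 : (c <= 0xEF) ? 2 : 3;
            seen = 1;
            switch (c) {
            case 0xC0: hi = 0x80; break; /* only the overlong null, C0 80 */
            case 0xE0: lo = 0xA0; break; /* no 3-byte overlongs */
            case 0xED: hi = 0x9F; break; /* no surrogates */
            case 0xF0: lo = 0x90; break; /* no 4-byte overlongs */
            case 0xF4: hi = 0x8F; break; /* nothing above 0x10FFFF */
            default: break;
            }
        }
    }
    if (need > 0)
        *partial = seen;
    return true;
}
```

When the input is one buffer of a larger file, a non-zero *partial tells the caller how many bytes to carry over and prepend to the next buffer, which is the role cPartial plays in GetRichEditFileType().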
Overhauled the Rich Edit demo
The rich edit demo now has the following features:
  • Main menu
  • List of recently used files
  • Resizable windows
  • Ctrl-B, Ctrl-I, Ctrl-U supported
  • File name in title, file name and page in status bar
  • Modified flag, caps lock, num lock, insert status on status bar
  • Window size, font name and size, file locations, file filters, recently used file names stored in registry
  • Paragraph numbering
  • Read and write text files
  • Many other enhancements
The compiled executable is at http://kevincarmody.com/hmg/SAMPLES/Con ... x/demo.exe.

The rich edit demo uses all the enhancements described above (except the HasNonAnsiChars property, SelPasteSpecial method, and HMG_UTF8InsertBOM function).

Source files changed: Source files added: Source files deleted:
Last edited by kcarmody on Thu Oct 08, 2015 12:27 pm, edited 1 time in total.
serge_girard
Posts: 3158
Joined: Sun Nov 25, 2012 2:44 pm
DBs Used: 1 MySQL - MariaDB
2 DBF
Location: Belgium

Re: RichEditBox change proposal, October 2015

Post by serge_girard »

Thank you very much Kevin !
Very useful. I couldn't download the EXE for my AVAST 'thinks' it is a virus...

Serge
There's nothing you can do that can't be done...

Re: RichEditBox change proposal, October 2015

Post by kcarmody »

serge_girard wrote: Thank you very much Kevin !
Very useful. I couldn't download the EXE for my AVAST 'thinks' it is a virus...

Serge
Avast has given false positives on this demo in the past. See http://hmgforum.com/viewtopic.php?f=43&t=4471&start=15. I have AVG and it has never reported any problem with the demo.

Avast in general has a high rate of false positives. See http://www.av-comparatives.org/wp-conte ... 503_en.pdf

Kevin

Re: RichEditBox change proposal, October 2015

Post by serge_girard »

Thanks for the info Kevin, I know what to do!

Serge