VB & Chinese string


Tag: unicode,vb6,utf-16

I am trying to capture Chinese text from a website using VB6.

The simple code below does this and works well with English sites:

    Private Function RequestText(sURL, Optional sMethod = "POST")
        'You may have caching issues using GET
        Set XMLHTTP = CreateObject("microsoft.XMLHTTP")
        sMethod = UCase(sMethod)
        XMLHTTP.Open sMethod, sURL, False
        XMLHTTP.send (Null) '"x=x"

        RequestText = XMLHTTP.responseText
        Set XMLHTTP = Nothing
    End Function

    Private Sub cmdText_Click()
        Dim html As String
        html = RequestText("http://url")
        Clipboard.SetText html
        MsgBox "Done"
    End Sub

When I paste the text into Word, Notepad, or a database, the Chinese characters appear as ????. Is there a solution for this?


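VB6 uses ANSI when it calls the Windows API, because Windows 95 did not support Unicode; in COM and internally its strings are Unicode. Any API call, such as the one the clipboard needs, therefore converts your text to ANSI, and characters that the ANSI code page cannot represent come out as ?. So don't use the TEXT format. XMLHTTP also has several response formats to choose from (responseText, responseBody, responseStream and responseXML).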
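As a rough illustration of using one of those other response formats, the sketch below decodes the raw bytes in responseBody with an explicit character set through ADODB.Stream instead of relying on responseText. The function name BytesToText and the "gb2312" charset in the usage comment are assumptions for illustration only; the page's real encoding has to be checked. This only affects how the response is decoded, and the clipboard conversion described above still applies afterwards.

```vb
' Hypothetical helper (not from the original post): decode a byte array,
' such as XMLHTTP.responseBody, using an explicitly named character set.
Private Function BytesToText(ByRef vBody As Variant, ByVal sCharset As String) As String
    Dim stm As Object
    Set stm = CreateObject("ADODB.Stream")

    stm.Type = 1            ' adTypeBinary: accept the raw bytes
    stm.Open
    stm.Write vBody
    stm.Position = 0        ' rewind before switching the stream type

    stm.Type = 2            ' adTypeText: reinterpret the bytes as text
    stm.Charset = sCharset  ' assumed value, e.g. "gb2312" or "utf-8"
    BytesToText = stm.ReadText

    stm.Close
    Set stm = Nothing
End Function

' Possible use inside RequestText, assuming the site serves GB2312:
'     RequestText = BytesToText(XMLHTTP.responseBody, "gb2312")
```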

Internet Explorer can access the clipboard, so you can use it for this.

This sample code reads the clipboard. NB: you have to navigate to a local file (any will do) to avoid security prompts.

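    Set Arg = WScript.Arguments
    Set WshShell = CreateObject("WScript.Shell")
    Set Inp = WScript.StdIn
    Set Outp = WScript.StdOut

    Sub Clip
        ' FilterPath must hold the folder of a local file (here Filter.html)
        Set ie = CreateObject("InternetExplorer.Application")
        ie.Visible = 0
        ie.Navigate2 FilterPath & "Filter.html"
        ' Wait until the local page has finished loading
        Do
            WScript.Sleep 100
        Loop Until ie.document.readyState = "complete"
        ' Read the clipboard text through the IE document
        txt = ie.document.parentWindow.clipboardData.getData("Text")
        ie.Quit
        If IsNull(txt) = True Then
            Outp.WriteLine "No text on clipboard"
        Else
            Outp.WriteLine txt
        End If
    End Sub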
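
The ANSI conversion can also be avoided from VB6 itself by placing the string on the clipboard in the CF_UNICODETEXT format through the Win32 clipboard functions. This is a minimal sketch rather than part of the answer above: the function name SetClipboardUnicode is hypothetical, the declarations assume 32-bit VB6 in a form module, and error handling is kept to the bare minimum.

```vb
Private Declare Function OpenClipboard Lib "user32" (ByVal hWnd As Long) As Long
Private Declare Function EmptyClipboard Lib "user32" () As Long
Private Declare Function CloseClipboard Lib "user32" () As Long
Private Declare Function SetClipboardData Lib "user32" (ByVal wFormat As Long, ByVal hMem As Long) As Long
Private Declare Function GlobalAlloc Lib "kernel32" (ByVal wFlags As Long, ByVal dwBytes As Long) As Long
Private Declare Function GlobalLock Lib "kernel32" (ByVal hMem As Long) As Long
Private Declare Function GlobalUnlock Lib "kernel32" (ByVal hMem As Long) As Long
Private Declare Function GlobalFree Lib "kernel32" (ByVal hMem As Long) As Long
Private Declare Sub CopyMemory Lib "kernel32" Alias "RtlMoveMemory" _
    (ByVal Destination As Long, ByVal Source As Long, ByVal Length As Long)

Private Const CF_UNICODETEXT As Long = 13
Private Const GMEM_MOVEABLE As Long = &H2
Private Const GMEM_ZEROINIT As Long = &H40

' Hypothetical helper: copies sText to the clipboard as UTF-16 text.
' Returns True on success.
Private Function SetClipboardUnicode(ByVal hWndOwner As Long, ByVal sText As String) As Boolean
    Dim hMem As Long
    Dim pMem As Long

    ' Room for the UTF-16 data plus a two-byte terminating null
    hMem = GlobalAlloc(GMEM_MOVEABLE Or GMEM_ZEROINIT, LenB(sText) + 2)
    If hMem = 0 Then Exit Function

    pMem = GlobalLock(hMem)
    If pMem = 0 Then
        GlobalFree hMem
        Exit Function
    End If
    ' StrPtr points at the string's internal UTF-16 buffer
    If LenB(sText) > 0 Then CopyMemory pMem, StrPtr(sText), LenB(sText)
    GlobalUnlock hMem

    If OpenClipboard(hWndOwner) <> 0 Then
        EmptyClipboard
        SetClipboardUnicode = (SetClipboardData(CF_UNICODETEXT, hMem) <> 0)
        CloseClipboard
    End If

    ' The system owns the memory only if SetClipboardData succeeded
    If Not SetClipboardUnicode Then GlobalFree hMem
End Function
```

In cmdText_Click, the line `Clipboard.SetText html` could then be replaced with `SetClipboardUnicode Me.hWnd, html`, assuming the code lives in the same form.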

