CStdioFile::WriteString adding additional carriage return for line feeds?

c++,unicode,mfc
I just met the following behaviour: When using CStdioFile::WriteString, it will convert a \n to \r\n. I didn't note this behaviour in earlier versions of my code, only after I converted my project to Unicode. What am I missing? I tried this code: CStdioFile file; CFileException fileException; file.Open(TEXT("c:\\test.txt"), CFile::modeCreate |...

How is Levenshtein Distance calculated on Simplified Chinese characters?

python,string,unicode,levenshtein-distance,edit-distance
I have 2 queries: query1:你好世界 query2:你好 When I run this code using the Python library Levenshtein: from Levenshtein import distance, hamming, median lev_edit_dist = distance(query1,query2) print lev_edit_dist I get an output of 12. Now the question is how is the value 12 derived? Because in terms of strokes difference, there's...

python 3, unicode conversion, two \u0000 as one character

string,python-3.x,unicode
My Python 3 script receives strings from a C++ program via a pipe. The strings are encoded as Unicode code points, and I need to decode them correctly. For example, consider a string that contains Cyrillic symbols: 'тест test'. Encoding this string in Python 3 with print('тест test'.encode()) gives b'\xd1\x82\xd0\xb5\xd1\x81\xd1\x82 test'. The C++ program encodes this string...

Java Unicode Variable names Devanagari

java,unicode
I am trying to write a program in which the variable names are in the Devanagari script of Sanskrit. Example: class फिल्म { public static void main(String args[]) { String गीत = "Songs"; System.out.println(गीत); } } When I try to execute this Java code, then it gives...

How to make Python Interactive Shell print cyrillic symbols?

python,shell,unicode,character-encoding,cyrillic
I'm using Pymorphy2 in my project as a cyrillic morphological analyzer. But when I try to print out the list of words, I get this: >>> for t in terms: ... p = morph.parse(t) ... if 'VERB' in p[0].tag: ... t = p[0].normal_form ... elif 'NOUN' in p[0].tag: ... t...

Inserting unicode message using sqlalchemy and mysql

python,mysql,unicode,sqlalchemy
I am trying to insert a string that has smiley face emojis into a MySQL database. I have the following test that throws an exception. How do I get past this error? Here is the test: def test_write_unicode(self): db_schema = "testing" db_url = sqlalchemy.engine.url.URL(drivername='mysql', host=selah.db_host, database=db_schema, query={ 'read_default_file' : selah.db_config...

Representation of python dictionaries with unicode in database queries

python,unicode,orient-db
I have a problem that I would like to know how to efficiently tackle. I have data that is JSON-formatted (used with dumps / loads) and contains unicode. This is part of a protocol implemented with JSON to send messages. So messages will be sent as strings and then loaded...

Using a context processor in conjunction with Jinja template variables

python,unicode,flask,jinja2
I am in the midst of deploying Stripe and it requires that payment values passed into it are stated in "cents" rather than dollars. I can handle this on the backend (i.e. I can process a payment for the appropriate amount) but in order to render properly in...

Selecting Unicode U+hex notation from database

php,mysql,select,unicode
I have a table containing some data, SELECT code, id FROM code WHERE id = 92; +--------+----+ | id | code| +--------+----+ | كتب عربية | 92 | +--------+----+ 1 row in set I need to be able to get this value from the SELECT-query, U+0643 U+062A U+0628 U+0639 U+0631...

Testing for unicode escaped strings

python,unicode,python-2.x
I have an array that looks like this data = [ { 'string': u'CN=Willian John sway\xc3\xa9rioGra\xc3\xa7a/[email protected]'}, { 'string': u'CN=E0999999.www.acme.com'} ] Some of the strings contain unicode escaped strings and some don't. I need to iterate over the array and unescape the unicode escaped strings. I tried doing this: for i...

java convert a english letter to unicode [closed]

java,unicode
I want to know how to create a program that converts an input English letter into its Unicode decimal value. For example, if I enter the letter E, it will output 69. I have already tried simply casting char to int, but don't know how to create an...
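
The number the asker expects (69 for E) is the character's Unicode code point. As a minimal sketch, written in Python rather than Java purely for illustration, the lookup could look like this; the helper name `code_point` is hypothetical and not part of the original question:

```python
def code_point(letter: str) -> int:
    """Return the decimal Unicode code point of a single character."""
    if len(letter) != 1:
        raise ValueError("expected exactly one character")
    return ord(letter)


print(code_point("E"))  # 69
```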

Why Unicode characters are not displayed properly in terminal with GCC?

c,gcc,unicode
I've written a small C program: #include <stdio.h> #include <stdlib.h> #include <locale.h> int main() { wprintf(L"%s\n", setlocale(LC_ALL, "C.UTF-8")); wchar_t chr = L'┐'; wprintf(L"%c\n", chr); } Why doesn't this print the character ┐ ? Instead it prints gibberish. I've checked: tried compiling without setlocale, same result the terminal itself can print...

Python writing to CSV… TypeError: coercing to Unicode: need string or buffer, file found

python,osx,csv,unicode
outputList is a list of lists. [ [a,b,c], [d,e,f], [g,h,i] ] and I want to output it to a csv file with each list as a separate row. I'm getting this error TypeError: coercing to Unicode: need string or buffer, file found and I don't know why. I'm using python...

Python array printing in unicode and csv issue

python,csv,unicode,unicode-string
dict1 is a dictionary with a corresponding array with 4 sample elements like this: {u'OlpyplEJ_c_hFxyand_Wxw': [u'Inchin Bamboo Garden', u'Paradise Valley', 33.575816, -111.926234], u'_qvxFHGbnbrAPeWBVifJEQ': [u"Lenny's Sub Shop", u'Charlotte', 35.334993, -80.8129717], u's5yzZITWU_RcJzWOgjFecw': [u"Sergio's Italian Gardens", u'Las Vegas', 36.100414, -115.1265829]} I am printing data using the business_id as the key for the above...

Font Awesome unicodes displaying as text

css,unicode,font-awesome
I'm trying to insert a FontAwesome icon via css using the Unicode, and my page is displaying it as text. So if the Unicode for a house icon is f015, my browser is displaying "/f015" instead of displaying a house icon. HTML header includes: <link rel="stylesheet" href="//maxcdn.bootstrapcdn.com/font-awesome/4.3.0/css/font-awesome.min.css"> CSS: #MenuBarLink1 a:before...

Is executing C++ code in comments with certain Unicode characters allowed, like in Java?

c++,c++11,unicode,comments
I know that executing Java code in comments with certain Unicode characters is allowed. Please see this question for further clarification Executing Java code in comments. So was just curious to know if C++ has such features?

Replace unicode characters with characters (Javascript)

javascript,unicode,unicode-string
Take for example the following string: &#8220;A profile of Mr. T, the A Team&#8217;s most well known member.&#8221; How do I use JavaScript to replace the Unicode character encodings and convert that to the following: "A profile of Mr. T, the A Team's most well known member."...
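
The `&#8220;` style sequences in the excerpt are HTML numeric character references. A minimal sketch of decoding them, shown in Python instead of JavaScript for illustration, using the standard library's `html.unescape`. Note that the references decode to typographic (curly) quotes, not the straight quotes shown in the question:

```python
import html

encoded = "&#8220;A profile of Mr. T, the A Team&#8217;s most well known member.&#8221;"

# Each &#NNNN; reference is replaced by the character with that code point.
decoded = html.unescape(encoded)
print(decoded)
```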

Handle windows-1252 and unicode in java [closed]

java,unicode,utf-8,character-encoding,bytearray
After a http request, I have got a byte array encoded with utf-8, e.g.: byte[] array = new byte[]{0xc3, 0xa4, 0xc2, 0x96} I decode the byte array using new String(array, "UTF-8"). In the example the first decoded char is 0xe4 which represents the letter ä in Unicode – so far...

erlang os:cmd() command with UTF8 binary

unicode,encoding,erlang,utf
I'm trying to get an Erlang function to execute a bash command containing unicode characters. For example, I want to execute the equivalent of: touch /home/jani/ჟანიweł I put that command in variable D, for example: io:fwrite("~ts", [list_to_binary(D)]). touch /home/jani/ჟანიwełok but after I execute: os:cmd(D) I get file called á??á??á??á??weÅ?. How...

Salesforce custom button returning 'Unexpected Token ILLEGAL?

javascript,unicode,salesforce
{!REQUIRESCRIPT("/soap/ajax/33.0/connection.js")} /*Getting the Running User Details*/ var result = sforce.connection.query( "SELECT Advisor__c " + "FROM User " + "WHERE Id = '{!$User.Id}'" ); /*Grab the Records returned by executing the above SOQL Query*/ var userDetails = result.getArray("records")[0]; /*Initializing the Contact record for Update*/ var contactToUpdate = new sforce.SObject("Contact"); contactToUpdate.Id =...

What happens under the hood when bytes converted to String in Java?

java,string,unicode,utf-8,byte
I have a problem when trying to convert bytes to String in Java, with code like: byte[] bytes = {1, 2, -3}; byte[] transferred = new String(bytes, Charsets.UTF_8).getBytes(Charsets.UTF_8); and the original bytes are not the same as the transferred bytes, which are respectively [1, 2, -3] [1, 2, -17, -65,...
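
The same round trip can be reproduced outside Java. The sketch below is in Python and assumes the decoder substitutes U+FFFD (the replacement character) for the invalid byte, which is consistent with the -17, -65 prefix shown in the excerpt; the conversion to signed values is only there to mirror Java's `byte` type:

```python
original = bytes([1, 2, 0xFD])  # 0xFD is the unsigned form of Java's (byte) -3

# 0xFD is never valid in UTF-8, so a lenient decoder replaces it with U+FFFD.
text = original.decode("utf-8", errors="replace")

# Encoding U+FFFD back to UTF-8 yields the three bytes EF BF BD.
round_tripped = text.encode("utf-8")

# Show the bytes as Java would print them (signed 8-bit values).
signed = [b - 256 if b > 127 else b for b in round_tripped]
print(signed)  # [1, 2, -17, -65, -67]
```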

write a grammar rule name in unicode [ANTLR 4]

java,parsing,unicode,antlr,antlr4
I am still a beginner in ANTLR 4 and I was wondering if there is a way to write a grammar rule name in unicode. For example, the following rule is fine: atomExp returns [double value] : n=Number {$value = Double.parseDouble($n.text);} | '(' exp=additionExp ')' {$value = $exp.value;} ; However,...

java linked list works except in one instance

java,unicode,junit
I am writing a Linked List in Java that is essentially a mixture of Java's String and StringBuilder classes. I have to test it with JUnit and all of the tests pass, except for the last two, where it passes in a string consisting of every character. In this case...

selenium webdriver - xpath locator not working if element's text contains Unicode Characters

selenium,xpath,unicode,selenium-webdriver,webdriver
I'm trying to select an option contained inside a menu. It's not a select menu, but it's styled to appear as such. Anyway, if the text contained inside the menu is in English, I can select it ok. Trouble is, the text I need to select is French so it...

How to map a arabic character to english string using python

python,unicode,arabic
I am trying to read a file that has Arabic characters like, 'ع ' and map it to English string "AYN". I want to create such a mapping of all 28 Arabic alphabets to English string in Python 3.4. I am still a beginner in Python and do not have...

ctrl+G in erl doesn't work

unicode,encoding,utf-8,erlang,docker
I'm trying to interconnect Erlang nodes, but entering ctrl+G does not work: Eshell V6.4.1 (abort with ^G) 1> ^G Eshell V6.4.1 (abort with ^G) 1> ^G Eshell V6.4.1 (abort with ^G) 1> ^G Eshell V6.4.1 (abort with ^G) Any idea why this can happen? I was thinking about locale settings,...

How to create a Persian file.txt and then explode it?

php,string,unicode,character-encoding,explode
I have a lot of Persian text and I want to explode it. I store my text in a file.txt. (So I have a file.txt containing Persian text.) Now my problem is the charset. When I save the text into file.txt, it gives me an error: This file contains characters in Unicode format...

Formatting string with Regex in java, how do I convert captured group to special character?

java,regex,unicode,special-characters
I have a set of commands that include hidden characters, written in a text file. One by one, they get read and sent to a server to execute commands. It's very important that special characters be formatted properly; however, they cannot simply be written in the text file as "\u0002", for...

How to detect homographic text, unicode spoofing with node.js

node.js,unicode,punycode
Users can get their own subsites on ours, so that www.example.com/subsite/gary would then be a specific user's subsite. However, I am worried about the possibility of homographic / unicode spoofing attacks, where a malicious user creates an account with a different username but with unicode characters that will appear the...

Need code for removing all unicode characters in vb6

unicode,vb6,unicode-string
I need code for removing all unicode characters in a vb6 string.

Formatting Errors with tail

linux,bash,unicode,cygwin,tail
How can I correctly parse this file through tail, without formatting errors? I am using tail within cygwin to parse the last ten lines of two files. One file parses through correctly, the other contains a space between every character. $ tail file2.txt -n 4 22/06/2015 12:28 - Decompressing and...

Converting Unicode codepoints to UTF-8 in C using iconv

c,unicode,encoding,utf-8,iconv
I want to convert a 32-bit value, which represents a Unicode codepoint, into a sequence of chars which is the utf-8 encoded string containing only the character corresponding to the codepoint. For example, I want to turn the value 955 into the utf-8 encoded string "λ". I tried to do...
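
For comparison with whatever the iconv-based C code produces, here is a minimal Python sketch of the same conversion. The function name `code_point_to_utf8` is hypothetical, and this does not use iconv at all; it only shows the expected bytes for the example value 955:

```python
def code_point_to_utf8(code_point: int) -> bytes:
    """Encode a single Unicode code point as a UTF-8 byte string."""
    return chr(code_point).encode("utf-8")


utf8 = code_point_to_utf8(955)
print(utf8)                  # b'\xce\xbb'
print(utf8.decode("utf-8"))  # λ
```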

How does Python 2 represent Unicode internally?

python,unicode
When I read this Python2's official page on Unicode, it says Under the hood, Python represents Unicode strings as either 16-or 32-bit integers, depending on how the Python interpreter was compiled. What does above sentence mean? Could it mean that Python2 has its own special encodings of Unicode? If so,...

Font is right. Why can't I get this unicode character to display in this C# console app?

c#,.net,unicode,console-application
I have added a font to cmd. (DejaVu Sans Mono) This techrepublic link has the registry hack to add a font to cmd, one can do it for some fonts such as that one. The font has a unicode non ascii glyph and I can paste that unicode non ascii...

how to put characters combination as constant command into iOS framework

ios,objective-c,unicode,frameworks,ascii
I am new to iOS programming. I am developing some SDK framework right now. I have a command with three characters: 'ESC' 'E' '1', I wanna combine those three characters to generate a NSString and put this NSString into framework. Therefore others can directly use this Constant in framework. Any...

█ character string indexed in python

python,string,unicode,ascii,python-2.x
I'm trying to get the index of 'J' in a string that is similar to myString = "███ ███ J ██" so I use myString.find('J') but it returns a really high value and if I replace '█' by 'M' or another character of the alphabet I get a lower value....
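
One plausible reading of the "really high value" is that the search runs over UTF-8 bytes rather than characters, since each █ (U+2588) takes three bytes. The Python 3 sketch below contrasts the two indexes; the string is rebuilt from escapes to match the excerpt, and the byte-based explanation is an assumption, not something the excerpt confirms:

```python
block = "\u2588"  # █ FULL BLOCK
my_string = block * 3 + " " + block * 3 + " J " + block * 2

# Index counted in characters (code points).
char_index = my_string.find("J")

# Index counted in UTF-8 bytes, as a byte-string search would report it.
byte_index = my_string.encode("utf-8").find(b"J")

print(char_index, byte_index)  # 8 20
```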

Created unicode & unicode without whitespace generators in ScalaCheck

scala,unit-testing,unicode,scalacheck
During testing we want to qualify unicode characters, sometimes with wide ranges and sometimes more narrow. I've created a few specific generators: // Generate a wide varying of Unicode strings with all legal characters (21-40 characters): val latinUnicodeCharacter = Gen.choose('\u0041', '\u01B5').filter(Character.isDefined) // Generate latin Unicode strings with all legal characters...

JavaCC and Unicode issue. Why \u696d cannot be managed in JavaCC although it belong to the range “\u4e00”-“\u9fff”

java,unicode,compiler-construction,antlr,javacc
We're trying to use JavaCC as a parser to parse source code which is in UTF-8( the language is Japanese). In JavaCC, we have a declaration like: < #LETTER: [ "\u0024", "\u0041"-"\u005a", "\u005f", "\u0061"-"\u007a", "\u00c0"-"\u00d6", "\u00d8"-"\u00f6", "\u00f8"-"\u00ff", "\u0100"-"\u1fff", "\u3040"-"\u318f", "\u3300"-"\u337f", "\u3400"-"\u3d2d", "\u4e00"-"\u9fff", "\uf900"-"\ufaff" ] > If it meets a string...

Any way to write X^T with just unicode?

unicode
I've given up and am just using the x⁺ found here. But I really want it to look like this: Also the arrow above y will be a problem too but I'm hoping to be able to do that with diacritical marks (?). In case you are wondering what this...

Excel Function - Convert unicode to ascii

excel,unicode,excel-formula
Is there any function in Excel (2010) that decodes Unicode to ASCII text? One column in my sheet contains words/sentences in Portuguese. For example: Esse Jean é feio né Should become Esse Jean é feio né é - This letter in the text is not Portuguese, is there any function...

How can I add an icon to select box choices?

html,css,unicode
Basically I have a text box with a modifier dropdown. I would like an icon of the current choice to display when chosen. The problem with the current set-up (using UNICODE) is that the icons do not always display, such as on google chrome (unless the specific fonts have been...

PHP - length of string containing emojis/special chars

php,unicode,unicode-string
I'm building an API for a mobile application and I seem to have a problem with counting the length of a string containing emojis. My code: $str = "👍🏿✌🏿️ @mention"; printf("strlen: %d" . PHP_EOL, strlen($str)); printf("mb_strlen UTF-8: %d" . PHP_EOL, mb_strlen($str, "UTF-8")); printf("mb_strlen UTF-16: %d" . PHP_EOL, mb_strlen($str, "UTF-16")); printf("iconv...

Characters and Strings in Swift

ios,string,swift,unicode,character
Reading the documentation and this answer, I see that I can initialize a Unicode character in either of the following ways: let narrowNonBreakingSpace: Character = "\u{202f}" let narrowNonBreakingSpace = "\u{202f}" As I understand, the second one would actually be a String. And unlike Java, both of them use double quotes...

LXML to write in unicode?

python,unicode,lxml
I am currently using lxml to write a file. I build the node and then I write it to a file using etree.tostring(node, pretty_print=True). However, it seems to be using htmlencoding -- <Synopsis> Abila schlie&#223;lich die ersten sechs Aufgaben zu meistern. Wird der Junge auch </Synopsis> In order to decipher...

How do I match “i” with Turkish i in java?

java,unicode,normalization
I want to match the lower case of "I" of English (i) to lower case of "İ" of Turkish (i). They are the same glyph but they don't match. When I do System.out.println("İ".toLowerCase()); the character i and a dot is printed(this site does not display it properly) Is there a...
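
The "i and a dot" result is the two-code-point sequence U+0069 U+0307 (i followed by COMBINING DOT ABOVE). A minimal Python sketch shows the same locale-independent mapping and one possible way to compare against a plain "i" by dropping combining marks; whether dropping marks is acceptable depends on the application, so treat that step as an assumption:

```python
import unicodedata

lowered = "\u0130".lower()  # İ LATIN CAPITAL LETTER I WITH DOT ABOVE
code_points = [f"U+{ord(ch):04X}" for ch in lowered]
print(code_points)  # ['U+0069', 'U+0307']

# Remove combining marks so that only the base letter remains.
stripped = "".join(ch for ch in lowered if not unicodedata.combining(ch))
print(stripped == "i")  # True
```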

How can I display unicode in QGraphicsTextItem? [duplicate]

qt,unicode,qgraphicsitem,qgraphicstextitem
This question already has an answer here: How to specify a unicode character using QString? 2 answers I would like to be able to display Unicode in QGraphicsTextItem (or a subclass of it). The only way to set text in QGraphicsTextItem seems to be setPlainText(text); Trying setPlainText(QString::fromUtf8("Caf\x00e9 Frap\x00e9")); or...

PHPExcel file “corrupt” when Unicode in data

excel,unicode,phpexcel
I've been using PHPExcel successfully for months now. Now I have users that are entering Unicode into their data, and now their reports are not opening in Excel. The exact error that Excel 2010 gives is: Excel found unreadable content in 'blahblahblah.xlsx'. Do you want to recover the contents of...

How do I search for all rows containing a given unicode character in Postgres

sql,postgresql,unicode
I am after all cells containing the 'LINE SEPARATOR' (U+2028) unicode point. Normally this is encoded as \u+2028 or something similar. However googling how this translates to SQL has given various options none of which seem to work ((N'2028'), set @hexstring = '2028';, vchar(2028)) SELECT * FROM myTable WHERE desc...

Handling count of characters with diacritics in R

r,unicode,character-encoding,nlp,linguistics
I'm trying to get the number of characters in strings with characters with diacritics, but I can't manage to get the right result. > x <- "n̥ala" > nchar(x) [1] 5 What I want to get is 4, since n̥ should be considered one character (i.e. diacritics shouldn't be...
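
The count of 5 comes from the combining mark being a code point of its own. A minimal sketch in Python rather than R, assuming the diacritic in the example is U+0325 (COMBINING RING BELOW); the helper name `count_base_characters` is hypothetical:

```python
import unicodedata


def count_base_characters(text: str) -> int:
    """Count code points that are not combining marks."""
    return sum(1 for ch in text if unicodedata.combining(ch) == 0)


x = "n\u0325ala"  # n + COMBINING RING BELOW + "ala"
print(len(x))                    # 5
print(count_base_characters(x))  # 4
```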

Create unicode character with pack

perl,unicode
I am trying to understand how Perl handles unicode. use feature qw(say); use strict; use warnings; use Encode qw(encode); say unpack "H*", pack("U", 0xff); say unpack "H*", encode( 'UTF-8', chr 0xff ); Output: ff c3bf Why do I get ff and not c3bf when using pack ?...

How to specify string variables as unicode strings for pattern and text in regex matching?

regex,python-2.7,unicode
>>> import re >>> re.match(u'^[一二三四五六七]、', u'一、') If the pattern and the text are stored in variables (for example, they were read from text files), >>> myregex='^[一二三四五六七]、' >>> mytext='一、' How shall I specify myregex and mytext to re.match, in the same way as re.match(u'^[一二三四五六七]、', u'一、')? Thanks....

Change lowercase and uppercase of characters in java

java,unicode,mapping,uppercase,lowercase
If I want to create a dictionary where the user can create a custom alphabet (that still uses unicode) Is there a way to change lowercase and uppercase mapping of the characters? Let's say I want the lowercase of 'I' to be 'ı' instead of 'i' or upperCase of 'b'...

Convert hex values to characters

perl,unicode
I have strings like: "Film-DVD \x{bb}Once / The Swell Season (Collector's Edition\x{ab} John Carney" which are the result of Data::Dumper. Now I want the hex-values \x{bb}, \x{ab} to be replaced with corresponding characters » and «. I already tried: $a =~ s/\\x\{(.{2})\}/chr(hex($1))/eg; But this returns me "Film-DVD �Once / The...
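
The substitution itself can be sketched in Python as follows. This only illustrates turning `\x{..}` escapes into characters; it does not address how the result is encoded when printed, which may be where the replacement character in the excerpt comes from. The helper name `expand_hex_escapes` is hypothetical:

```python
import re

dumped = r"Film-DVD \x{bb}Once / The Swell Season (Collector's Edition\x{ab} John Carney"


def expand_hex_escapes(text: str) -> str:
    """Replace each literal \\x{HEX} escape with the character it names."""
    return re.sub(
        r"\\x\{([0-9A-Fa-f]+)\}",
        lambda match: chr(int(match.group(1), 16)),
        text,
    )


restored = expand_hex_escapes(dumped)
print(restored)
```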

PHP - SQL query containing unicode is returning NULL for some reason

php,mysql,unicode,null
I am trying to run this query: SELECT trans FROM `dictionary` WHERE `word` LIKE 'Çiçek' like this (relevant code): function gettranslation($word){ return $this->query("SELECT trans FROM `dictionary` WHERE `word` LIKE '$word'"); } function query($query){ $result=mysqli_query($this->conn, "set character_set_results='utf8'"); $result=mysqli_query($this->conn, $query); return $row = mysqli_fetch_row($result)[0]; } My mySQL table is made like this:...

How to find UTF-8 reference of a composite unicode character

unicode,encoding,utf-8,character-encoding
At work, i have this issue where i need to find the UTF-8 reference of a composite unicode character. The character in question is a "n" with a "^" on top : n̂. This is represented in unicode by the character "n" (U+006E) followed by the circumflex accent (U+0302). What...
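
Assuming "UTF-8 reference" means the UTF-8 byte sequence, a short Python sketch can print it for the two code points named above. It also checks that NFC normalization leaves the pair unchanged, which suggests there is no single precomposed code point for this combination:

```python
import unicodedata

composite = "n\u0302"  # n + COMBINING CIRCUMFLEX ACCENT

utf8_bytes = composite.encode("utf-8")
hex_bytes = " ".join(f"{b:02X}" for b in utf8_bytes)
print(hex_bytes)  # 6E CC 82

# NFC would merge the pair into one code point if a precomposed form existed.
is_unchanged_by_nfc = unicodedata.normalize("NFC", composite) == composite
print(is_unchanged_by_nfc)  # True
```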

how to print unicode character U-1F4A9 'pile of poo' in ruby

ruby,unicode,rubymine-7
I am trying to print a unicode character in Ruby, specifically the pile of poo. It has a unicode value of U-1F4A9. But when I try to print "\u1F4A9" to the output or a file, I see nothing. Do I need to print to a specific type of file to...
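
A likely cause is that a `\u` escape followed by bare hex digits consumes only four of them, so `\u1F4A9` is U+1F4A followed by the digit 9. The sketch below shows this in Python, where the behaviour is easy to verify; Python spells the long form `\U0001F4A9`, while Ruby has its own braced form, so take this as an analogy rather than Ruby code:

```python
import unicodedata

short_escape = "\u1F4A9"     # four hex digits only: U+1F4A, then the character "9"
long_escape = "\U0001F4A9"   # eight hex digits: the intended code point

print(len(short_escape), short_escape[1])  # 2 9
print(unicodedata.name(long_escape))       # PILE OF POO
```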

How to pickle and unpickle to portable string in Python 3

python,python-3.x,serialization,unicode
I need to pickle a Python3 object to a string which I want to unpickle from an environmental variable in a Travis CI build. The problem is that I can't seem to find a way to pickle to a portable string (unicode) in Python3: import os, pickle from my_module import...

How can I type an actual lamda character in Visual Studio?

visual-studio,unicode
I'm talking about this guy, right here: λ In Microsoft Word, you can type 03BB followed by ALT+X to get that character. This does not work in Visual Studio 2013. Any ideas? To specify: I intend to enter the 'λ' character as part of C# source code, not as part...

Python: difficulty converting ascii to unicode

python,unicode,encoding,utf-8
My goal: get the page source from a url and count all instances of a keyword within that page source How I am doing it: getting the pagesource via urllib2, looping through each char of the page source and comparing it to the keyword My problem: my keyword is encoded...

Tabulating characters with diacritics in R

r,unicode,nlp,linguistics
I'm trying to tabulate phones (characters) occurrences in a string, but diacritics are tabulated as characters on their own. Ideally, I have a wordlist in International Phonetic Alphabet, with a fair amount of diacritics and several combinations of them with base characters. I give here a MWE with just one...

Inserting French character into Oracle gets converted into some junk characters

oracle,unicode,plsqldeveloper
Using PL/SQL Developer, I'm able to insert French characters in my Oracle database without any error. Querying: SELECT * FROM nls_database_parameters WHERE parameter = 'NLS_NCHAR_CHARACTERSET'; Output: AL16UTF16 But when I retrieve the data using a select statement it gets converted into some junk characters, e.g.: système gets converted to système...

Python 3 UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d

python,unicode
I want to make a search engine and I am following a tutorial from a website. I want to test parsing HTML: from bs4 import BeautifulSoup def parse_html(filename): """Extract the Author, Title and Text from a HTML file which was produced by pdftotext with the option -htmlmeta.""" with open(filename) as infile: html =...

Search for unicode values in character string

r,unicode,grep,gsub
I am trying to identify unique unicode values in a data frame composed of character strings. I have tried using the grep function, however I encounter the following error Error: '\U' used without hex digits in character string starting ""\U" A example data frame time sender message 1 2012-12-04 13:40:00...

Export (Android/Java) string data in with extended characters for import into Excel

java,android,excel,unicode,utf-8
I need to export string data that includes the 'degrees' symbol ("\u00B0"). This data is exported as a csv text file with UTF-8 encoding. As would be expected, the degrees symbol is encoded as two characters (0xC2, 0xB0) within the java (unicode) string. When the CSV file is imported into...

Why/how does the browser decide ☃.net goes to xn--n3h.net

url,browser,unicode,character-encoding,iri
If we type into firefox or chrome http://☃.net/ It takes us to http://xn--n3h.net/ Which is a mirror of unicodesnowmanforyou.com What I don't understand is by what rules the unicode snowman can decode to xn--n3h, it doesn't look anything like utf-8 or urlencoding. I think I found a hint while mucking...
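
The `xn--` form is Punycode, the encoding used for internationalized domain names, rather than UTF-8 or percent-encoding. A minimal sketch using Python's built-in `punycode` and `idna` codecs reproduces the label from the excerpt; the built-in `idna` codec follows the older IDNA 2003 rules, so browsers may apply different validation:

```python
snowman = "\u2603"  # ☃ SNOWMAN

# Raw Punycode for the label, without the "xn--" prefix.
label = snowman.encode("punycode")
print(label)  # b'n3h'

# Full host name conversion, which adds the "xn--" prefix per label.
host = (snowman + ".net").encode("idna")
print(host)  # b'xn--n3h.net'
```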

Need to convert Java String with è to \u00E8 using Java

java,unicode
I have a Java String object which contains a word like "resumè" or, for that matter, any word with an international character in it. What I want to do is encode the non-ASCII characters so that the result is an ASCII string like "resum\u00E8". How do I do this...
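
The transformation can be sketched as a loop that passes ASCII through and writes everything else as `\uXXXX`. The version below is in Python for illustration, and the function name `escape_non_ascii` is hypothetical. It goes through UTF-16 so that characters above U+FFFF come out as two escapes (a surrogate pair), which is an assumption about what Java-style output should look like:

```python
def escape_non_ascii(text: str) -> str:
    """Replace every non-ASCII character with \\uXXXX escapes (UTF-16 code units)."""
    parts = []
    for ch in text:
        if ord(ch) < 128:
            parts.append(ch)
            continue
        units = ch.encode("utf-16-be")  # 2 bytes, or 4 for a surrogate pair
        for i in range(0, len(units), 2):
            parts.append("\\u%04X" % int.from_bytes(units[i:i + 2], "big"))
    return "".join(parts)


print(escape_non_ascii("resum\u00e8"))  # resum\u00E8
```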

BeautifulSoup parsing unicode giving variable results

python-2.7,unicode,beautifulsoup,ipython-notebook
I am trying to parse the following IPython notebook; however, I am getting varying results when I read the unicode into a BeautifulSoup object, i.e. from IPython.nbconvert.exporters import HTMLExporter from IPython.config import Config from bs4 import BeautifulSoup filepath = '2015-05-01_test2.ipynb' config = Config({'CSSHTMLHeaderTransformer': {'enabled': True, 'highlight_class': '.highlight-ipynb'}}) exporter =...

Working with characters based on their UTF-8 hex codes

javascript,jquery,unicode,utf-8
I'm working on something that will read a user's text messages and export them to a csv file, which they can then download. The messages are being retrieved from a third-party web interface—I am essentially using js to grab the html of each message and compiling it as needed. The...

Parsing string containing Unicode character names

python,unicode
I have a string >>> s u'M\\N{AMPERSAND}M\\N{APOSTROPHE}s' >>> print s M\N{AMPERSAND}M\N{APOSTROPHE}s How do I turn it into M&M's?...
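
One way to interpret the literal `\N{...}` sequences is the `unicode_escape` codec. The sketch below uses Python 3 syntax (the excerpt is Python 2) and assumes the input is pure ASCII apart from the escapes, since `unicode_escape` treats its input as Latin-1 and would mangle other text:

```python
s = "M\\N{AMPERSAND}M\\N{APOSTROPHE}s"  # backslashes are literal, as in the excerpt

# Round-trip through bytes so the codec can interpret the \N{...} escapes.
decoded = s.encode("ascii").decode("unicode_escape")
print(decoded)  # M&M's
```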

Allow whitespace, unicode letters, digits, underscore, dash AND comma?

php,regex,unicode,preg-match
I'm pretty new at the subject preg and I'm using this preg_match condition to check if the user has entered whitespace, unicode letters, digits, underscore or dash: if(preg_match("/[^\040\pL\pN_-]/u", $term)) { But now I wanted to allow a comma. So I tried this: if(preg_match("/[^\040\pL\pN,_-]/u", $term)) { And it actually works and...

Java Unicode character displaying box instead of Runic letter

java,string,unicode,graphics
I am trying to draw random Nordic runes in a little Java game, but all I'm getting back is a square character. public class MyComponent extends JComponent { public void paintComponent(Graphics g) { String s = "\u16A8"; g.drawString(s,50,50); } } What the character should be displaying: https://en.wikipedia.org/wiki/Ansuz_(rune) What it's actually...

Perl internal representation of unicode string

perl,unicode,encoding,mojolicious
I am working on a Perl + Mojolicious web application and my front-end sends a POST query containing accents in an "a" parameter ("été") using charset utf-8, as I can see in the Chrome network tab. But the server-side script decodes that parameter using a charset that I didn't expect. I...

Why python 2.7 on Windows need a space before unicode character when print?

python,python-2.7,unicode
I use cmd on Windows with chcp 65001. This is my code: print u'\u0110 \u0110' + '\n' Result: (a character cmd can't display) (the character I want) Traceback (most recent call last): File "b.py", line 26, in <module> print u'\u0110 \u0110' IOError: [Errno 2] No such file or directory But when I...

Why is executing Java code in comments with certain Unicode characters allowed?

java,unicode,comments
The following code produces the output "Hello World!". (No really, try it) public static void main(String... args) { // The comment below is no typo. // \u000d System.out.println("Hello World!"); } The reason for this is that the Java compiler parses the Unicode character \u000d as a new line and gets...

Using sprintf with unicode characters

c,unicode
I wanted to print out depictions of playing cards using Unicode. Code snippet: void printCard(int card){ char strCard[10]; sprintf(strCard, "\U0001F0A%x", (card%13)+1); printf("%s\n", strCard); } Since the \U requires 8 hex characters after it I get the following from compiling: error: incomplete universal character name \U0001F0A I could create a bunch...
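
Because a universal character name is resolved at compile time, it cannot be completed by a format specifier at run time. An alternative is to compute the code point arithmetically and encode it afterwards. The sketch below shows that idea in Python rather than C; the function name `card_glyph` is hypothetical and the base value 0x1F0A0 is taken from the escape in the snippet:

```python
import unicodedata


def card_glyph(card: int) -> str:
    """Return the playing-card character for a card index, as in the snippet."""
    return chr(0x1F0A0 + (card % 13) + 1)


ace = card_glyph(0)
print(unicodedata.name(ace))  # PLAYING CARD ACE OF SPADES
print(ace.encode("utf-8"))    # the four UTF-8 bytes a C program would need to emit
```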

unicode converting in RestTemplate in Spring

java,spring,unicode,utf-8
My aim is to get user info by accessToken using the Facebook API. I get a response, but the email in this response looks like this: aaaaaa\u0040mail.com. To convert it I added some properties, but this doesn't work: RestTemplate restTemplate = new RestTemplate(); restTemplate.getMessageConverters().add(0, new StringHttpMessageConverter(Charset.forName("UTF-8"))); String facebook = restTemplate.getForObject( "https://graph.facebook.com/me?access_token=" + facebookAccessToken, String.class); How can...

Unicode in Android

android,unicode
I have a small problem printing the complement symbol in Android. char c = '\u2216'; // should be the unicode for complement textView2.setText(c); // gives out "" nothing // if i take c = '\u2229' // it works But why can't I print out the complement symbol? Where is...

Python: Transform a unicode variable into a string variable

python,unicode,casting,web-crawler,unicode-string
I used a web crawler to get some data. I stored the data in a variable price. The type of price is: <class 'bs4.element.NavigableString'> The type of each element of price is: <type 'unicode'> Basically the price contains some white space and line feeds followed by: $520. I want to...

Open File with Non ASCII Characters

c++,visual-studio-2010,unicode
I am trying to compute SHA-256 of file. I have the following code that gives correct value of Checksum when the path is valid ie. It is ASCII. I have the following code: #include <openssl\evp.h> #include <sys\stat.h> #include <iostream> #include <string> #include <fstream> #include <cstdio> const int MAX_BUFFER_SIZE = 1024;...

mysql() refuses to read characters in php

php,mysql,unicode
I copied some code from word into my database which is causing problems when I want to retrieve them with mysql() in php. I've identified what the problem is, dashes like this one is causing an error retrieving it: Tue – Thur: I have to convert the dash to the...

Why can't I add a Dog Face (u+1f436) field to my object without using a String? [duplicate]

javascript,unicode
This question already has an answer here: What characters are valid for JavaScript variable names? 12 answers This: var foo = { 🐶 : true //Truely adorable }; Gives me an Illegal Character error on Firefox and Chrome. However, var foo = { '🐶' : true }; Works perfectly....

Displaying unicode characters in Python 3

python,list,unicode
I have issues with displaying Unicode characters. As an output I have this list (only on online IDEs): [u'\u0413', u'\0434', u'\043b'] How can I convert this sequence to normally visible text? I have # -*- coding: utf-8 -*- in header and also each string marked as Unicode like u'String' I...

Punycode for Unicode query parameter

java,url,unicode,punycode
I am trying to encode some Unicode URLs with Punycode. These URLs have a query parameter that contains non-ASCII characters, for example: https://en.wiktionary.org/w/index.php?title=Clœlia&printable=yes The problem is, when I try to do it in Java, the resulting URL is wrong: String link = "https://en.wiktionary.org/w/index.php?title=Clœlia&printable=yes"; link = IDN.toASCII(link); // -> link = http://en.wiktionary.org/w/index.xn--php?title=cllia&printable=yes-hgf...

Convert multichar %xx escapes to unicode

python,unicode,urllib
In the middle of writing this I got this to work. Here it is anyway in case it's useful or the solution is less than optimal. I have a unicode string u'http://en.wikipedia.org/wiki/Espa%C3%B1ol' from which I'd like to have u'http://en.wikipedia.org/wiki/Español'. My attempt using urllib.unquote gives me u'http://en.wikipedia.org/wiki/Espa\xc3\xb1ol'....
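A minimal sketch of the decoding step described above, assuming Python 3 (the question itself uses the Python 2 `urllib.unquote`). The helper name `decode_percent_escapes` is hypothetical; `urllib.parse.unquote` decodes the percent escapes as UTF-8 by default, so the two bytes `%C3%B1` come back as the single character `ñ` rather than as two separate characters.

```python
from urllib.parse import unquote


def decode_percent_escapes(url):
    """Return url with %xx escapes decoded as UTF-8 (hypothetical helper)."""
    # unquote() collects consecutive %xx escapes into bytes and decodes
    # them with the given encoding, so multi-byte sequences stay together.
    return unquote(url, encoding="utf-8")


decoded = decode_percent_escapes("http://en.wikipedia.org/wiki/Espa%C3%B1ol")
print(decoded)
```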

Scala Unicode Syntax

scala,unicode,intellij-idea,sbt
I know that these two are equivalent in Scala: for {x <- xs} yield x case Nil => println("foo") Note the Unicode replacement for <- and =>: for {x ← xs} yield x case Nil ⇒ println("foo") What is this feature called? I googled various combinations of "Scala Unicode Operators/Symbols"...

Python length of unicode string confusion

python,unicode
There's been quite some help around this already, but I am still confused. I have a unicode string like this: title = u'😉test' title_length = len(title) #5 But! I need len(title) to be 6. The clients expect it to be 6 because they seem to count in a different way...
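One possible reading of the mismatch is that the clients count UTF-16 code units, where a character outside the Basic Multilingual Plane such as the emoji takes two units. The sketch below assumes that reading and Python 3; the helper name `utf16_length` is hypothetical.

```python
def utf16_length(text):
    """Count UTF-16 code units in text (hypothetical helper)."""
    # Each UTF-16 code unit is two bytes; "utf-16-le" adds no byte-order mark.
    return len(text.encode("utf-16-le")) // 2


title = "\U0001F609test"  # the winking-face emoji followed by "test"
print(len(title))           # code points
print(utf16_length(title))  # UTF-16 code units
```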

Why does Python 3 output \xe3, an extra char?

python,python-3.x,unicode,utf-8
Why does Python add \xe3 in the output of: >>> b'Transa\xc3\xa7\xc3\xa3o'.decode('utf-8') 'Transaç\xe3o' Expected value is: 'Transação' Some more information about my environment: >>> import sys >>> print (sys.version) 3.4.3 (v3.4.3:9b73f1c3e601, Feb 24 2015, 22:44:40) [MSC v.1600 64 bit (AMD64)] >>> sys.stdout.encoding 'cp437' This was under Console 2 + Powershell....
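A plausible explanation, offered as an assumption rather than a confirmed diagnosis: the decoded string is correct, but the console encoding reported above (cp437) has no `ã`, so the interactive prompt shows that one character as an escape. The sketch below checks which characters of the decoded text cp437 cannot represent; the helper name `can_encode` is hypothetical.

```python
def can_encode(ch, encoding):
    """Return True if ch is representable in the given encoding (hypothetical helper)."""
    try:
        ch.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


text = b'Transa\xc3\xa7\xc3\xa3o'.decode('utf-8')

# Characters the console encoding cannot represent are the ones that
# would be displayed as escapes.
unencodable = [ch for ch in text if not can_encode(ch, 'cp437')]
print(len(text), [hex(ord(ch)) for ch in unencodable])
```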

Is there any way to check whether an ordinal position contains a character or is empty?

string,python-2.7,unicode
If I print a character using its ordinal number with unichr(ordinal) and the ordinal position does not contain any valid character, the result is: >>> print unichr(0x0c80) ಀ Now I want to filter such null characters from a string and I tried str.encode('utf-8', errors='ignore') as: >>> print ''.join([unichr(i)...

Create File In Linux With Unicode File Name

python,linux,windows,unicode
I created a Python script to read an email file using the "email" module, extract its attachments to the file system, zip the extracted files and email the zip file to someone. The attachments may have Unicode file names, such as Chinese or Japanese. I found that the module "email.header.decode_header" can...

How to display Arabic unicode text in page that retrieved from database

java,unicode,utf-8,xhtml,arabic
I need your help in displaying some Arabic text which is stored in a variable in the xhtml page. I have configured my project in jdeveloper to include UTF-8 in the properties and the Arabic text is displayed correctly. I have a variable called bankName and it has the unicode...

Text only getting correct UNICODE if i echo an arabic text before CURL

php,curl,unicode
I have PHP code that is supposed to fetch a JSON file containing Arabic text with curl and display it. The code won't show the Arabic text in the right encoding unless I echo some Arabic text (even a single letter) before the curl statement. The Working...

How do I compare each character of a String while accounting for characters with length > 1?

java,string,unicode,character-encoding,utf-16
I have a variable string that might contain any unicode character. One of these unicode characters is the han 𩸽. The thing is that this "han" character has "𩸽".length() == 2 but is written in the string as a single character. Considering the code below, how would I iterate over...

Cyrillic characters in SVG font

javascript,svg,unicode,encoding,fonts
You will have to bear with me with this, as I know very little about encodings so may be asking something very simple/obvious. I am working with some SVG fonts in browser-side Javascript and I need to pull out some information on Cyrillic characters grammatically. I am doing this currently...

Haskell: quoteFile fails on text file with “invalid byte sequence” on unicode characters

linux,haskell,unicode,encoding,utf-8
I'm facing an issue with quoteFile in my virtual environment (Debian Wheezy with GHC 7.8.4 installed). I have defined a file-oriented version of the st quasi quoter from Text.Shakespeare.Text: import Language.Haskell.TH.Quote (QuasiQuoter, quoteFile) import Text.Shakespeare.Text (st) stFile :: QuasiQuoter stFile = quoteFile st This works very well on my host machine, however,...

fonts - how does the text get displayed when there is no font associated with it

unicode,fonts
The data encoding is UTF-8 or Unicode. Plain data does not have any font attached to it. If an editor did not have any font support, how would it display the data?

Maya Python object assigned as a listType, but won't convert to string?

python,string,list,unicode,maya
I'm starting to use objects a bit more to store Maya commands in Python. This is super useful! But I've run into a problem. Sometimes objects get commands that return Unicode lists rather than a string. Even using str() doesn't work. Code: cubeParent = cmds.polyCube(sx=10, sy=15, sz=5, h=20) print cubeParent...

Behaviour unicode string in python

python,unicode
I have seen this question, and I have doubts about how to convert a variable to Unicode at run time. Is it right to use the unicode function? Are there other ways to convert a string at run time? print(u'Cami\u00f3n') # prints with right special char name=unicode('Cami\u00f3n') print(name) #...

Convert unicode URL to ASCII

php,unicode,encoding,ascii
I'm writing a PHP application that accepts a URL from the user, and then processes it by making some calls to binaries with system()*. However, to avoid many complications that arise with this, I'm trying to convert the URL, which may contain Unicode characters, into ASCII characters. Let's say...

Calculate bytes of the unicode character in python

python,unicode,byte
I'm writing a Python script to read Unicode characters from a file and insert them into a database. I can only insert 30 bytes of each string. How do I calculate the size of the string in bytes before I insert into the database?
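A minimal sketch of one way to measure this, assuming Python 3 and assuming the database column counts UTF-8 bytes (the question does not say which encoding the database uses). The helper names `utf8_size` and `truncate_utf8` are hypothetical. The size in bytes is the length of the encoded string, not the length of the string itself.

```python
def utf8_size(text):
    """Return the number of bytes text occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


def truncate_utf8(text, limit=30):
    """Cut text so its UTF-8 encoding fits in limit bytes (hypothetical helper)."""
    # Slicing the bytes may split a multi-byte character; errors="ignore"
    # drops the incomplete trailing sequence instead of raising.
    return text.encode("utf-8")[:limit].decode("utf-8", errors="ignore")


sample = "\u0442\u0435\u0441\u0442 test"  # 'тест test' from a Cyrillic example
print(len(sample), utf8_size(sample))
```

Truncating at the byte level and decoding with `errors="ignore"` keeps the result valid even when the byte limit falls in the middle of a character.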

Android Notifications Locale specific message with Unicode

android,unicode,locale
Hi, I want to use local regional-language text in a notification. I can send the text as Unicode, and it works properly where the device supports the language. But to support all devices, I want to set the typeface (custom fonts) for the notification text. I tried with RemoteView...