C++ mbrtowc()

The mbrtowc() function in C++ converts a narrow multibyte character to a wide character (of type wchar_t).

The mbrtowc() function is defined in the <cwchar> header file.

mbrtowc() prototype

size_t mbrtowc( wchar_t* pwc, const char* s, size_t n, mbstate_t* ps );

The mbrtowc() function converts the multibyte character pointed to by s to a wide character and stores it at the address pointed to by pwc.

  • If s is not a null pointer, a maximum of n bytes starting from the byte pointed to by s are examined in order to determine the number of bytes necessary to complete the next multibyte character (including any shift sequences).
    If the next multibyte character in s is complete and valid within those n bytes, the function converts it to the corresponding wide character and, if pwc is not a null pointer, stores it in the location pointed to by pwc.
  • If s is a null pointer, the parameters n and pwc are ignored and the call is equivalent to std::mbrtowc(NULL, "", 1, ps).
  • If the wide character produced is a null character, the conversion state stored in *ps is the initial shift state.
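
The behaviour described above can be sketched as a loop that converts a whole string one character at a time. This is only an illustrative sketch: decode_all() is a hypothetical helper name, not a standard function, and it simply stops at the first error rather than reporting it.

```cpp
#include <cwchar>
#include <cstring>
#include <string>

// Hypothetical helper (not part of the standard library): converts a
// null-terminated multibyte string to a std::wstring, one character per call.
std::wstring decode_all(const char* s)
{
    std::mbstate_t state = std::mbstate_t();   // initial conversion state
    std::wstring result;
    const char* end = s + std::strlen(s) + 1;  // include the terminating null

    while (s < end) {
        wchar_t wc;
        // Let mbrtowc() examine every byte that is still left in the buffer.
        std::size_t len = std::mbrtowc(&wc, s, static_cast<std::size_t>(end - s), &state);

        if (len == 0)                               // null character converted
            break;
        if (len == static_cast<std::size_t>(-1) ||  // invalid sequence
            len == static_cast<std::size_t>(-2))    // incomplete sequence
            break;

        result += wc;  // one complete character was converted
        s += len;      // skip the bytes that character occupied
    }
    return result;
}
```

Advancing s by the return value is what moves the loop from one multibyte character to the next. Which byte sequences count as valid depends on the locale selected with setlocale().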

mbrtowc() Parameters

  • pwc: Pointer to the memory address where the converted wide character is stored.
  • s: Pointer to the multibyte character to convert.
  • n: Maximum number of bytes in s to examine.
  • ps: Pointer to the conversion state used when interpreting the multibyte string.
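
To illustrate the role of n and ps, here is a minimal sketch that passes a buffer to mbrtowc() one byte at a time (n is always 1) while reusing a single mbstate_t object. The name feed_bytes() is hypothetical, and the choice to stop silently on an error is an assumption made to keep the sketch short.

```cpp
#include <cwchar>
#include <cstddef>
#include <string>

// Hypothetical helper: feeds a buffer to mbrtowc() byte by byte, reusing one
// mbstate_t so that a partially read character carries over between calls.
std::wstring feed_bytes(const char* data, std::size_t len)
{
    std::mbstate_t state = std::mbstate_t();  // initial conversion state
    std::wstring result;

    for (std::size_t i = 0; i < len; ++i) {
        wchar_t wc;
        std::size_t rc = std::mbrtowc(&wc, data + i, 1, &state);

        if (rc == static_cast<std::size_t>(-2))
            continue;  // incomplete: state remembers the bytes seen so far
        if (rc == static_cast<std::size_t>(-1) || rc == 0)
            break;     // invalid sequence or null character: stop
        result += wc;  // this byte completed a character
    }
    return result;
}
```

In a UTF-8 locale, the first byte of a two-byte character such as µ would be expected to yield (size_t)-2, and the call with the second byte would then complete the character, because the partial input is kept in the state object rather than in the buffer.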

mbrtowc() Return value

The mbrtowc() function returns the first of the following that applies:

  • 0 if the character converted from s (and stored in *pwc if pwc is not null) is the null character.
  • The number of bytes (between 1 and n) of the multibyte character successfully converted from s.
  • (size_t)-2 if the next n bytes form an incomplete, but so far valid, multibyte character. Nothing is stored in *pwc.
  • (size_t)-1 if an encoding error occurs. errno is set to EILSEQ and nothing is stored in *pwc.
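
Because the return type is the unsigned size_t, the two error codes are the values -1 and -2 converted to size_t, so comparing against explicit casts avoids signed/unsigned surprises. The sketch below shows one way to tell the four cases apart; classify() and the strings it returns are hypothetical names chosen for illustration.

```cpp
#include <cwchar>
#include <cstddef>
#include <string>

// Hypothetical helper: describes the outcome of a single mbrtowc() call.
std::string classify(const char* s, std::size_t n)
{
    std::mbstate_t state = std::mbstate_t();  // initial conversion state
    wchar_t wc;
    std::size_t rc = std::mbrtowc(&wc, s, n, &state);

    if (rc == static_cast<std::size_t>(-1))
        return "invalid";                     // encoding error, errno == EILSEQ
    if (rc == static_cast<std::size_t>(-2))
        return "incomplete";                  // more bytes are needed
    if (rc == 0)
        return "null";                        // the null character was converted
    return "complete:" + std::to_string(rc);  // rc bytes formed one character
}
```

The two error codes are tested before the byte count because, as size_t values, they are very large positive numbers and would otherwise be mistaken for a length.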

Example: How the mbrtowc() function works

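#include <cwchar>
#include <clocale>
#include <iostream>
using namespace std;

void test_mbrtowc(const char *s, size_t n)
{
	mbstate_t ps = mbstate_t();	// start in the initial conversion state
	wchar_t wc;
	// mbrtowc() returns size_t, so the error codes are (size_t)-1 and (size_t)-2
	size_t retVal = mbrtowc(&wc, s, n, &ps);

	if (retVal == static_cast<size_t>(-2))
		wcout << L"Next " << n << L" byte(s) doesn't represent a complete multibyte character" << endl;
	else if (retVal == static_cast<size_t>(-1))
		wcout << L"Next " << n << L" byte(s) doesn't represent a valid multibyte character" << endl;
	else if (retVal == 0)
		wcout << L"The converted wide character is a null wide character" << endl;
	else
	{
		wcout << L"Next " << n << L" byte(s) hold " << retVal << L" bytes of multibyte character, ";
		wcout << L"Resulting wide character is " << wc << endl;
	}
}

int main()
{
	// The conversion depends on the locale; stop if it is not available
	if (setlocale(LC_ALL, "en_US.utf8") == NULL)
	{
		wcerr << L"Locale en_US.utf8 is not available" << endl;
		return 1;
	}

	char str1[] = "\xC2\xB5";	// UTF-8 encoding of U+00B5 (micro sign)
	char str2[] = "\0";

	test_mbrtowc(str1, 1);	// only the first of the two bytes is examined
	test_mbrtowc(str1, 5);	// the whole character fits within 5 bytes
	test_mbrtowc(str2, 5);	// the first byte is the null character

	return 0;
}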

When you run the program on a system where the en_US.utf8 locale is available, the output will be:

Next 1 byte(s) doesn't represent a complete multibyte character
Next 5 byte(s) hold 2 bytes of multibyte character, Resulting wide character is µ
The converted wide character is a null wide character