Hi,
I have a NW6.5 SP5 server that reboots on its own at least once a week
without creating an abend.log file or giving any indication of the cause
of the failure. It just reboots, and several days later it does it again.

The first thing I did when this happened was check SYS$LOG.ERR,
abend.log and every other log file I could find, but I noticed nothing at
all. I then changed the auto restart after abend setting from 1 (default)
to 0, but the server rebooted again a few days later.

One thing I noticed was that just before the server rebooted it would
invariably cause the replica-ring master server to lose one of its
network card connections (I have implemented the Intel IANS fault
tolerance driver on the master server), which would make the secondary
card kick in. This led me to implement a similar setup on the offending
server, but to no avail (it continued abending). I then enabled the conlog
utility, and during the last two system failures the console log file
captured "stuff" in its last entries (see the Failure entries below),
which led me to believe that my server has been the victim of some sort
of attack.

Based on the entries below, I feel as though someone has been toying with
my system and causing it to fail, so I locked it down following the tips
and advice in the OES services security document by Thomas Erickson
(http://www.novell.com/coolsolutions/...s_security.pdf),
disabling all of the web services not being used except file and print
services. Today it happened again.

Is there anything else that someone can think of that I am missing? Am I
correct to assume that my system has been compromised based on the data
below, or am I just jumping the gun, so to speak?

Any help and advice would be greatly appreciated.

Many Thanks,
Carls S.

Failure 1
eeFreeFreeFreeFreeFreeFreeFreeFreeFreeFreeFreeFree FreeFreeFreeFreeFreeFreeF
reeFreeFreeFreeFreeFreeFreeFreeFreeFreeFreeFreeFre eFreeFreeFreeFreeFreeFree
FreeFreeFreeFreeFreeFreeFreeFreeFreeFreeFreeFreeFr eeFreeFreeFreeFreeFreeFre
eFreeFreeFreeFreeFreeFreeFreeFreeFreeFreeFreeFreeF reeFreeFreeFreeFreeFreeFr
eeFreeFreeFreeFreeFreeFreeFreeFreeFreeFreeFreeFree FreeFreeFreeFreeFreeFreeF
reeFreeFreeFreeFreeFreeFreeFreeFreeFreeFreeFreeFre eFreeFreeFreeFreeFreeFree
FreeFreeFreeFreeFreeFreeFreeFreeFreeFree


Failure 2
r the
first character of the next match, depending on how you like
to look at it). Each string has its own pos() value.

Suppose you want to match all of consective pairs of digits
in a string like "1122a44" and stop matching when you
encounter non-digits. You want to match C<11> and C<22> but
the letter <a> shows up between C<22> and C<44> and you want
to stop at C<a>. Simply matching pairs of digits skips over
the C<a> and still matches C<44>.

$_ = "1122a44";
my @pairs = m/(\d\d)/g; # qw( 11 22 44 )

If you use the \G anchor, you force the match after C<22> to
start with the C<a>. The regular expression cannot match
there since it does not find a digit, so the next match
fails and the match operator returns the pairs it already
found.

$_ = "1122a44";
my @pairs = m/\G(\d\d)/g; # qw( 11 22 )

You can also use the C<\G> anchor in scalar context. You
still need the C<g> flag.

$_ = "1122a44";
while( m/\G(\d\d)/g )
{
print "Found $1\n";
}

After the match fails at the letter C<a>, perl resets pos()
and the next match on the same string starts at the beginning.

$_ = "1122a44";
while( m/\G(\d\d)/g )
{
print "Found $1\n";
}

print "Found $1 after while" if m/(\d\d)/g; # finds "11"

You can disable pos() resets on fail with the C<c> flag.
Subsequent matches start where the last successful match
ended (the value of pos()) even if a match on the same
string as failed in the meantime. In this case, the match
after the while() loop starts at the C<a> (where the last
match stopped), and since it does not use any anchor it can
skip over the C<a> to find "44".

$_ = "1122a44";
while( m/\G(\d\d)/gc )
{
print "Found $1\n";
}

print "Found $1 after while" if m/(\d\d)/g; # finds "44"

Typically you use the C<\G> anchor with the C<c> flag
when you want to try a different match if one fails,
such as in a tokenizer. Jeffrey Friedl offers this example
which works in 5.004 or later.

while (<>) {
chomp;
PARSER: {
m/ \G( \d+\b )/gcx && do { print "number: $1\n"; redo; };
m/ \G( \w+ )/gcx && do { print "word: $1\n"; redo; };
m/ \G( \s+ )/gcx && do { print "space: $1\n"; redo; };
m/ \G( [^\w\d]+ )/gcx && do { print "other: $1\n"; redo; };
}
}

For each line, the PARSER loop first tries to match a series
of digits followed by a word boundary. This match has to
start at the place the last match left off (or the beginning
of the string on the first match). Since C<m/ \G( \d+\b
)/gcx> uses the C<c> flag, if the string does not match that
regular expression, perl does not reset pos() and the next
match starts at the same position to try a different
pattern.

=head2 Are Perl regexes DFAs or NFAs? Are they POSIX compliant?

While it's true that Perl's regular expressions resemble the DFAs
(deterministic finite automata) of the egrep(1) program, they are in
fact implemented as NFAs (non-deterministic finite automata) to allow
backtracking and backreferencing. And they aren't POSIX-style either,
because those guarantee worst-case behavior for all cases. (It seems
that some people prefer guarantees of consistency, even when what's
guaranteed is slowness.) See the book "Mastering Regular Expressions"
(from O'Reilly) by Jeffrey Friedl for all the details you could ever
hope to know on these matters (a full citation appears in
L<perlfaq2>).

=head2 What's wrong with using grep in a void context?

The problem is that grep builds a return list, regardless of the context.
This means you're making Perl go to the trouble of building a list that
you then just throw away. If the list is large, you waste both time and
space.
If your intent is to iterate over the list, then use a for loop for this
purpose.

In perls older than 5.8.1, map suffers from this problem as well.
But since 5.8.1, this has been fixed, and map is context aware - in void
context, no lists are constructed.

=head2 How can I match strings with multibyte characters?

Starting from Perl 5.6 Perl has had some level of multibyte character
support. Perl 5.8 or later is recommended. Supported multibyte
character repertoires include Unicode, and legacy encodings
through the Encode module. See L<perluniintro>, L<perlunicode>,
and L<Encode>.

If you are stuck with older Perls, you can do Unicode with the
C<Unicode::String> module, and character conversions using the
C<Unicode::Map8> and C<Unicode::Map> modules. If you are using
Japanese encodings, you might try using the jperl 5.005_03.

Finally, the following set of approaches was offered by Jeffrey
Friedl, whose article in issue #5 of The Perl Journal talks about
this very matter.

Let's suppose you have some weird Martian encoding where pairs of
ASCII uppercase letters encode single Martian letters (i.e. the two
bytes "CV" make a single Martian letter, as do the two bytes "SG",
"VS", "XX", etc.). Other bytes represent single characters, just like
ASCII.

So, the string of Martian "I am CVSGXX!" uses 12 bytes to encode the
nine characters 'I', ' ', 'a', 'm', ' ', 'CV', 'SG', 'XX', '!'.

Now, say you want to search for the single character C</GX/>. Perl
doesn't know about Martian, so it'll find the two bytes "GX" in the "I
am CVSGXX!" string, even though that character isn't there: it just
looks like it is because "SG" is next to "XX", but there's no real
"GX". This is a big problem.

Here are a few ways, all painful, to deal with it:

$martian =~ s/([A-Z][A-Z])/ $1 /g; # Make sure adjacent ``martian''
# bytes are no longer adjacent.
print "found GX!\n" if $martian =~ /GX/;

Or like this:

@chars = $martian =~ m/([A-Z][A-Z]|[^A-Z])/g;
# above is conceptually similar to: @chars = $text =~ m/(.)/g;
#
foreach $char (@chars) {
print "found GX!\n", last if $char eq 'GX';
}

Or like this:

while ($martian =~ m/\G([A-Z][A-Z]|.)/gs) { # \G probably unneeded
print "found GX!\n", last if $1 eq 'GX';
}

Here's another, slightly less painful, way to do it from Benjamin
Goldberg:

$martian =~ m/
(?!<[A-Z])
(?:[A-Z][A-Z])*?
GX
/x;

This succeeds if the "martian" character GX is in the string, and fails
otherwise. If you don't like using (?!<), you can replace (?!<[A-Z])
with (?:^|[^A-Z]).

It does have the drawback of putting the wrong thing in $-[0] and $+[0],
but this usually can be worked around.

=head2 How do I match a pattern that is supplied by the user?

Well, if it's really a pattern, then just use

chomp($pattern = <STDIN>);
if ($line =~ /$pattern/) { }

Alternatively, since you have no guarantee that your user entered
a valid regular expression, trap the exception this way:

if (eval { $line =~ /$pattern/ }) { }

If all you really want to search for a string, not a pattern,
then you should either use the index() function, which is made for
string searching, or if you can't be disabused of using a pattern
match on a non-pattern, then be sure to use C<\Q>...C<\E>, documented
in L<perlre>.

$pattern = <STDIN>;

open (FILE, $input) or die "Couldn't open input $input: $!; aborting";
while (<FILE>) {
print if /\Q$pattern\E/;
}
close FILE;

=head1 AUTHOR AND COPYRIGHT

Copyright (c) 1997-2002 Tom Christiansen and Nathan Torkington.
All rights reserved.

This documentation is free; you can redistribute it and/or modify it
under the same terms as Perl itself.

Irrespective of its distribution, all code examples in this file
are hereby placed into the public domain. You are permitted and
encouraged to use this code in your own programs for fun
or for profit as you see fit. A simple comment in the code giving
credit would be courteous but is not required.
adKey module from CPAN or use the sample code in
L<perlfunc/getc>.

If your system supports the portable operating system programming
interface (POSIX), you can use the following code, which you'll note
turns off echo processing as well.

#!/usr/bin/perl -w
use strict;
$| = 1;
for (1..4) {
my $got;
print "gimme: ";
$got = getone();
print "--> $got\n";
}
exit;

BEGIN {
use POSIX qw(:termios_h);

my ($term, $oterm, $echo, $noecNetWare Loadable Module
MPCONST.NLM  5 *7 ` n U  U  c l 
n $ ( H $Perl 5.8.4 - ModPerl-Const Extension
LONGnone Const_p VeRsIoN#  
CoPyRiGhT=@Copyright (C) 2000-01, 2004-05 Novell, Inc. All Rights
Reserved. MeSsAgEs ,U

CuStHeAd T d CyGnUsEx U< P8 P4
P]UWVS}= EP= t6E0PE,PE(PE$PE
PEPEPEPEPE PW ` -` Pj h` 
 E` Ed E h h ]Sh  h h h h E
PWƃ0
 j ƃ =H t>E0PE,PE(PE$PE
PEPEPEPEPE PWƃ0 =T tb j
Ãu 0{WjSuS 0QX
\ =D t8h h  E(PE$PE PEPEPEPEPE PW
ƃ0VEPh  e[^_]ÐUS@ P=T t'
Pj\ P\ P =L t!@ P
S j ]]ÐUVS1=P t @ P
ƃSe[^] UWVS uF@N F@))
E x@
 @E EK5 F \ C t+ 
RR @E Ph tP ES ]P۸  MP
ɋEtvUPҸ tYPEh Ph` V0h
h  h V
V F e[^_]Í ' + *' + 냉'
+ e , D PPEjPSV' E@ Pj
PPEh@ PhK VPVÅt
@ PMj SShR QhK VER PV^
jEPPV) UWVS}]WU$
PWSR E x
 PPEWPU PEWESP EtK E]jVWPS {
tUQEQjVWRSP MP  @ @BEe[^_]Ít&
' @ @uk
UWVS] }juSW1 jEh S
1 E` E8-tUQjPRNJ<:tI<&t5 ]
MVuVWSQ e [^_]
Fȍ ' FV1ӉÃ t
FPEPEWPEP6 u뎍v ' E 3 6
UWVSE } O 7@4 @, X,G@G@))
@E {P E UD@
 @ED E 9}D& Uڋ@
tB @E PCEPEPW#9}O NjO UDe[^_]Ív
jEPPWjEPEDPWEO hE 8PSh
WO  UVS] u<A{ Aw $x RRh
St*t& PSh VE ue[^]E  PPh
StZPPh St:PPh StPPh SuE
E E {E  oPPh StPPh
S,E  5QQh St$RRh S
E  E  PPh
 S PPh Sj PPh SF PPh"
S" PPh+ S PPh6 S QQh>
S RRhF S PPhM Sn PPhU
SJ PPh] S& PPhg S PPhq
S PPh{ S QQh S RRh
Sr PPh SN PPh* S* PPh
S PPh S PPh S PPh
S QQh Sv RRh SR PPh
S. PPh S
 PPh S PPh S PPh  S
PPh Sz QQh SN RRh S*
PPh" S PPh* S PPh1 S
PPh8 S PPh? Sb PPhF S
QQhL S RRhS S PPhZ S
 PPha S* PPhg S PPhp S
e PPh} SA PPh S QQh S
 RRh S PPh S PPh S
 PPh StmPPh SPPh S
 PPh StQQh SE D E K
E 3 E S E < E 6 E 5 xE $
lE = `E & TE ? HE  <E ! 0E
$ ' E
E & E  E E C
E  E A & E  E  E 
E  wE  kE  _E ~ SE 8N GE 7N
;E 6N /E 5N #E 4N E 3N E 1N E 0N
E /N E .N E -N E ,N E +N E *N
E )N E (N E 'N E &N {E %N oE $N
cE "N WE !N KE @ ?RRh S`PPh
SPPh SPPh' SdPPh<
SPPhQ S PPha SQQhs
SRRh~ SP PPh SPPh
SPPh SPPh* S PPh
S PPh S QQh S\ RRh
S8 PPh S PPh S PPh
S PPh  S PPh S
PPh! S` QQh- S  RRh8 S$
PPhB S  PPhN S PPhZ S
PPhe S PPhp SttPPh} SQQh
SFRRh St$PPh S E  E
 E E ps E p E  E 0 E
p E E @ E E  E  {E
 oE cE @ WE KE  ?& E 
+PPh SPPh SkPPh PPh
SttPPh S<QQh SRRh
St$PPh ShE  sE  gE
[PPh SPPh SPPh! S
LPPh, SPPh9 SQQhK S
PRRhY SPPhf SPPhp S
PPhz SnPPh S<
& PPh SPPh S KQQh
SsRRh PPh SPPh S
PPh SPPh SPPh S
bPPh SQQh S2 '
RRh SXPPh  PPh STPPh
S\PPh! SPPh0 SPPhC
SQQhK SRRhU SnPPhb
SPPhk SIPPhw S PPh
SPPh SZPPh QQh RRh
S PPh Sl PPh S\PPh
SrPPh S PPh S PPh
S QQh S RRh) S PPh?
Sd PPhT S9 PPhj S PPh
S PPh S PPh S QQh
S RRh Sa PPh S= PPh
 S PPh  S PPh5 S PPhJ
S* PPh_ SQQht SRRh
SPPh SPPh SaPPh
S$PPh S E  E ,
E  E  E 3 E * E  E 
E E E E w & E n {E
P oE F cE  WE  KE  ?E  3E
 'PPh SPPh SQQh S
RRh Qjh S
ufUVS] u<A Aw0$ PPh
StCQQh St&PSh) VE ue[^]
E @ E  RRhF SuPj hK Ve[^]PPhN
SuE
PPhQ Su
Pj h` PPhu StEPPh~ St%QQh
S%Rj h kE #E PPh
S PPh S PPh
S PPh S PPh StdPPh
StDQQh St$RRh
SLE < WE 2 KE ( ?E  3E 
'E
E  E  PPh
Sp PPh
SL PPh'
S( PPh7
S PPh?
S PPhL
S QQhZ
S RRhq
St PPh
SP PPh
S, PPh
S PPh
S PPh
S PPh
S QQh
Sl RRh SH PPh S$ PPh/
S  PPhG S PPhX S:PPhj
S* PPh SQQh Sd RRh
S< PPh S PPh S PPh
S PPh S PPh S PPh
Sd QQh
S@ RRh& S PPhD S PPh_
S PPh{ S PPh S PPh
Sh PPh SD QQh S  RRh
S PPh
S PPh
S PPh,
S PPhE
StdPPh[
StDPPht
St$QQh
SE  E  E  E  wE 
kE  _E  S E  ;E  /E 
#E  E  E * E  E  E 
E  E  E  E  E  E 
E  E  {t& E  kE  _E  SE
3 GE 1 ;E 0 /E / # E . E -
 E , E E E E E
E E E E f E e {E d
oRRh*
St$PPh
S,E  7E  +PPh
SPPh
S$ PPh
S  PPh
S PPh
StQQh
S\RRh
SxPPh
StxPPh
StXPPh StDPPh St$PPh S
E E  E  E  E  E 
E PPh* StQQh= StRRhI S
t*PPhV StPPhk StPPh| S
PPh SPPh S PPh S
QQh SRRh S4 PPh S
 PPh S|PPh( S PPhA S
 PPhR SPPha S`QQhn S
<RRh{ SPPh SPPh S
PPh SPPh SPPh S
PPh SLQQh S(RRh S
DPPh S@PPh SPPh S
l PPh SH PPh SPPh S
  QQh  S RRh SPPh S
* PPh# S| PPh5 S8PPh@ S
4 PPhM S PPhW S QQh` S
RRhh S PPhq S PPh S
tlPPh StLPPh St,PPh St PPh
:E  SE  GE  ;E  /E  #E 
E  E  E  E  E 
E E E E  QQh
SRRh SLPPh S
oPPh SKPPh S'PPh S
PPh SGPPh SQQh S
RRh SPPh- S PPh7 S
PPh? SPPhB SgPPhJ S
CPPhS SQQh^ SRRhj S
?PPhu SPPh SPPh S
E  v E {PPh SPPh S
4PPh StQQh SHRRh S
$PPh S PPh St& '
PPh SPPh SPPh SPPh
 SPt& ' QQh S\RRh 
S8PPh& SPPh- SXPPh3
S4PPh: S*PPhB Sv
PjhI Su6USP]cw!$$ PPShR
4 tPPShY 1ҋ]Љ]PPShr  PPShx
* tPPSh  tQQSh @ tRRSh
* xPPSh  [PPSh 
.& ' PPSh  PPSh ` 
PPSh  PPSh | QQSh
 RRSh  PPSh 
dPPSh  JPPSh @ 0USP]
cwq$p PPSh  u ' ]Љ]PPShR
 tPPSh  ' tPPSh
1PPSh $ PPSh! @ QQSh& $
RRSh1  PPSh5 ` 2PPSh: 
XPPShB  PPShJ * !PPShS
t PPSh\  QQShg *
RRSho   0.01 :: $
Const.xs ModPerl::Const::compile bootstrap parameter XS_VERSION %s::%s
VERSION %s object version %s does not match %s%s%s%s %_ APR Apache
Usage: %s->compile(...) APPEND unknown APR:: constant %s BLK BLOCK_READ
BINARY BUFFERED CHR CREATE DIR DELONCLOSE EXCL ENOSTAT ENOPOOL EBADDATE
EINVALSOCK ENOPROC ENOTIME ENODIR ENOLOCK ENOPOLL ENOSOCKET ENOTHREAD
ENOTHDKEY EGENERAL ENOSHMAVAIL EBADIP EBADMASK EDSOOPEN EABSOLUTE
ERELATIVE EINCOMPLETE EABOVEROOT EBADPATH EOF EINIT ENOTIMPL EMISMATCH
EBUSY EACCES EEXIST ENAMETOOLONG ENOENT ENOTDIR ENOSPC ENOMEM EMFILE
ENFILE EBADF EINVAL ESPIPE EAGAIN EINTR ENOTSOCK ECONNREFUSED EINPROGRESS
ECONNABORTED ECONNRESET ETIMEDOUT EHOSTUNREACH ENETUNREACH EFTYPE EPIPE
EXDEV ENOTEMPTY END FILEPATH_NOTABOVEROOT FILEPATH_SECUREROOTTEST
FILEPATH_SECUREROOT FILEPATH_NOTRELATIVE FILEPATH_NOTABSOLUTE
FILEPATH_NATIVE FILEPATH_TRUENAME FINFO_LINK FINFO_MTIME FINFO_CTIME
FINFO_ATIME FINFO_SIZE FINFO_CSIZE FINFO_DEV FINFO_INODE FINFO_NLINK
FINFO_TYPE FINFO_USER FINFO_GROUP FINFO_UPROT FINFO_GPROT FINFO_WPROT
FINFO_ICASE FINFO_NAME FINFO_MIN FINFO_IDENT FINFO_OWNER FINFO_PROT
FINFO_NORM FINFO_DIRENT FLOCK_SHARED FLOCK_EXCLUSIVE FLOCK_TYPEMASK
FLOCK_NONBLOCK GREAD GWRITE GEXECUTE HOOK_REALLY_FIRST HOOK_FIRST
HOOK_MIDDLE HOOK_LAST HOOK_REALLY_LAST LNK LOCK_FCNTL LOCK_FLOCK
LOCK_SYSVSEM LOCK_PROC_PTHREAD LOCK_POSIXSEM LOCK_DEFAULT LIMIT_CPU
LIMIT_MEM LIMIT_NPROC LIMIT_NOFILE NOFILE NONBLOCK_READ OVERLAP_TABLES_SET
OVERLAP_TABLES_MERGE PIPE POLLIN POLLPRI POLLOUT POLLERR POLLHUP POLLNVAL
REG READ SOCK SHUTDOWN_READ SHUTDOWN_WRITE SHUTDOWN_READWRITE SUCCESS
SO_LINGER SO_KEEPALIVE SO_DEBUG SO_NONBLOCK SO_REUSEADDR SO_SNDBUF
SO_RCVBUF SO_DISCONNECTED TRUNCATE UNKFILE UREAD UWRITE UEXECUTE
URI_FTP_DEFAULT_PORT URI_SSH_DEFAULT_PORT URI_TELNET_DEFAULT_PORT
URI_GOPHER_DEFAULT_PORT URI_HTTP_DEFAULT_PORT URI_POP_DEFAULT_PORT
URI_NNTP_DEFAULT_PORT URI_IMAP_DEFAULT_PORT URI_PROSPERO_DEFAULT_PORT
URI_WAIS_DEFAULT_PORT URI_LDAP_DEFAULT_PORT URI_HTTPS_DEFAULT_PORT
URI_RTSP_DEFAULT_PORT URI_SNEWS_DEFAULT_PORT URI_ACAP_DEFAULT_PORT
URI_NFS_DEFAULT_PORT URI_TIP_DEFAULT_PORT URI_SIP_DEFAULT_PORT
URI_UNP_OMITSITEPART URI_UNP_OMITUSER URI_UNP_OMITPASSWORD
URI_UNP_OMITUSERINFO URI_UNP_REVEALPASSWORD URI_UNP_OMITPATHINFO
URI_UNP_OMITQUERY WREAD WWRITE WEXECUTE WRITE APR:: AUTH_REQUIRED
ACCESS_CONF unknown Apache:: constant %s CRLF
CR DIR_MAGIC_TYPE httpd/unix-directory DECLINED DONE DECLINE_CMD  FLAG
FORBIDDEN FTYPE_RESOURCE FTYPE_CONTENT_SET FTYPE_PROTOCOL FTYPE_TRANSCODE
FTYPE_CONNECTION FTYPE_NETWORK HTTP_CONTINUE HTTP_SWITCHING_PROTOCOLS
HTTP_PROCESSING HTTP_OK HTTP_CREATED HTTP_ACCEPTED HTTP_NON_AUTHORITATIVE
HTTP_NO_CONTENT HTTP_RESET_CONTENT HTTP_PARTIAL_CONTENT HTTP_MULTI_STATUS
HTTP_MULTIPLE_CHOICES HTTP_MOVED_PERMANENTLY HTTP_MOVED_TEMPORARILY
HTTP_SEE_OTHER HTTP_NOT_MODIFIED HTTP_USE_PROXY HTTP_TEMPORARY_REDIRECT
HTTP_BAD_REQUEST HTTP_UNAUTHORIZED HTTP_PAYMENT_REQUIRED HTTP_FORBIDDEN
HTTP_NOT_FOUND HTTP_METHOD_NOT_ALLOWED HTTP_NOT_ACCEPTABLE
HTTP_REQUEST_TIME_OUT HTTP_CONFLICT HTTP_GONE HTTP_LENGTH_REQUIRED
HTTP_PRECONDITION_FAILED HTTP_REQUEST_ENTITY_TOO_LARGE
HTTP_REQUEST_URI_TOO_LARGE HTTP_UNSUPPORTED_MEDIA_TYPE
HTTP_RANGE_NOT_SATISFIABLE HTTP_EXPECTATION_FAILED
HTTP_UNPROCESSABLE_ENTITY HTTP_LOCKED HTTP_FAILED_DEPENDENCY
HTTP_INTERNAL_SERVER_ERROR HTTP_NOT_IMPLEMENTED HTTP_BAD_GATEWAY
HTTP_SERVICE_UNAVAILABLE HTTP_GATEWAY_TIME_OUT HTTP_VARIANT_ALSO_VARIES
HTTP_INSUFFICIENT_STORAGE HTTP_NOT_EXTENDED ITERATE ITERATE2 LF LOG_EMERG
LOG_ALERT LOG_CRIT LOG_ERR LOG_WARNING LOG_NOTICE LOG_INFO LOG_DEBUG
LOG_LEVELMASK LOG_TOCLIENT LOG_STARTUP MPMQ_NOT_SUPPORTED MPMQ_STATIC
MPMQ_DYNAMIC MPMQ_MAX_DAEMON_USED MPMQ_IS_THREADED MPMQ_IS_FORKED
MPMQ_HARD_LIMIT_DAEMONS MPMQ_HARD_LIMIT_THREADS MPMQ_MAX_THREADS
MPMQ_MIN_SPARE_DAEMONS MPMQ_MIN_SPARE_THREADS MPMQ_MAX_SPARE_DAEMONS
MPMQ_MAX_SPARE_THREADS MPMQ_MAX_REQUESTS_DAEMON MPMQ_MAX_DAEMONS
MODE_READBYTES MODE_GETLINE MODE_EATCRLF MODE_SPECULATIVE MODE_EXHAUSTIVE
MODE_INIT M_GET M_PUT M_POST M_DELETE M_CONNECT M_OPTIONS M_TRACE M_PATCH
M_PROPFIND M_PROPPATCH M_MKCOL M_COPY M_MOVE M_LOCK M_UNLOCK
M_VERSION_CONTROL M_CHECKOUT M_UNCHECKOUT M_CHECKIN M_UPDATE M_LABEL
M_REPORT M_MKWORKSPACE M_MKACTIVITY M_BASELINE_CONTROL M_MERGE M_INVALID
METHODS NO_ARGS NOT_FOUND OPT_NONE OPT_INDEXES OPT_INCLUDES OPT_SYM_LINKS
OPT_EXECCGI OPT_UNSET OPT_INCNOEXEC OPT_SYM_OWNER OPT_MULTI OPT_ALL OK
OR_NONE OR_LIMIT OR_OPTIONS OR_FILEINFO OR_AUTHCFG OR_INDEXES OR_UNSET
OR_ALL RAW_ARGS REDIRECT RSRC_CONF REMOTE_HOST REMOTE_NAME REMOTE_NOLOOKUP
REMOTE_DOUBLE_REV SATISFY_ALL SATISFY_ANY SATISFY_NOSPEC SERVER_ERROR
TAKE1 TAKE2 TAKE12 TAKE3 TAKE23 TAKE123 TAKE13 Apache:: common unknown
apr:: group `%s' error filetype fileperms filepath filemode finfo flock
hook lockmech limit poll read_type shutdown_how socket table uri cmd_how
config unknown apache:: group `%s' filter_type http input_mode log mpmq
methods options override platform remotehost satisfy types   H
x   * * * \ * p 
 *     *  7 `   ` ` ` t ' `
` L' ( - . ` ` / 0 1
2 $2 <2 S2 $2 3 $2 $2 $2 3 $2 $2 $2 Q3 $2 k3 3 3 3
4 t4 t4 4 t4 4 4 t4 t4 4 4 t4 5 O5 t4 i5 5
5
HTTP_PROXY_AUTHENTICATION_REQUIRED
NLMI 

V   
           
     ! 0 
  ' < Q a  ! , 9 K Y    
    
    " + 6 > F M
U ] g q {    *         
     " * 1 8 ? F L S Z a g p }
          s ~    * 
        ! - 8 B N Z e p f
p z  C     ) ? T j   
  
  5 J _ t      K
U b k w         }  
 Q         -
7 * = I V k |     
  ( A       *

 & - 3 : B u ~ ? 

  R a n {  

'
7
?
L
Z
q







 / G X j 
& D _ { 

,
E
[
t

               # 5
@ M W ` h q      F N
B J S ^ j u  
 







      
.text
 5 .rodata *7  .data *7 D .bss .comment dT
\ .note T P GCC: (GNU) 2.95.3 20010315 (release) GCC: (GNU)
3.2.3 GCC: (GNU) 3.2.3 GCC: (GNU) 3.2.3   01.01  
01.01   01.01   01.01
libperlapache2aprliblibcBAGF   
MPK_Bag MT Safe NLM
  @ @ @2 @G @ @ @ @* @ @ @ @ @ @_
@ @ @ @  @ @0 @B @J @U @` @r @z @ @ @ @ @
@ @ @ @  @! @1 @A @Q @ @ @ @ @ @ @& @< C
  w @: @F @ @ @ @ @ @ @
@K @_ @ @ @ @ @
@
@)
@A
@Y
@q
@
@
@
@
@
@ @ @1 @I @a @y @ @ @ @ @ @ @! @9 @Q @i @
@ @ @ @ @ @
@)
@A
@Y
@q
@
@
@
@
@
@ @ @1 @I @a @y @ @ @ @ @{ @ @ @ @ @ @  @#
@; @S @k @ @ @ @ @ @ @ @+ @C @[ @s @ @ @ @
@ @ @ @3 @G @_ @w @ @ @ @ @ @ @ @ @# @_ @w
@ @ @ @ @ @ @ @7 @O @s @ @ @ @ @ @ @ @)
@A @Y @ @ @ @ @ @ @ @ @7 @O @g @ @ @ @ @
@ @ @ @' @? @W @o @ @ @ @ @ @ @ @/ @G @_ @w
@ @ @ @ @ @ @ @7 @O @g @ @ @ @ @ @ @3 @:
@N @c @ @ @ @ @ @ @ @& @? @c @{ @ @ @ @ @
@ @w @ @ @ @ @ @! @! @7! @O! @g! @! @! @! @! @! @!
@" @'" @?" @W" @o" @" @" @" @" @" @" @# @/# @G# @_# @w# @# @#
@# @# @# @$ @$ @7$ @O$ @g$ @$ @$ @$ @$ @ ' @' @O' @g' @' @'
@' @' @' @' @( @( @/( @C( @( @( @( @( @( @) @/) @G) @_) @w)
@) @) @) @) @) @* @* @7* @O* @g* @* @* @* @* @* @* @+ @'+ @?
+ @W+ @o+ @+ @+ @+ @+ @+ @+ @, @/, @G, @_, @w, @, @, @, @, @,
@, @- @- @- @. @4. @L. @d. @|. @. @. @. @. @. @ / @$/ @</ @T/
@l/ @/ @/ @/ @/ @/ @0 @/0 @C0 @[0 @s0 @0 @0 @0 @0 @0 @#1 @;1
@S1 @k1 @1 @1 @1 @1 @2 @2 @2 @(2 @@2 @J2 @W2 @a2 @p2 @z2 @2 @2 @
2 @2 @2 @2 @2 @2 @3 @3 @3 @(3 @;3 @E3 @U3 @_3 @o3 @y3 @3 @3 @3
@3 @3 @3 @3 @3 @4 @4 @4 @=4 @G4 @V4 @`4 @x4 @4 @4 @*4 @4 @4
@4 @4 @4 @4 @4 @5 @ 5 @5 @&5 @95 @C5 @S5 @]5 @m5 @w5 @5 @5 @5
@5 @x |         *      
                 
        $ ( , 0 4 8 < @ D H
L P T X \ ` d h l p t x |     
   *      *        
                @ D
H L P T X ` d h l p t |      *
                 
       $ ( , 0 4 8 < @ D H L
P T X \ ` d h l p t x |      
  *               
              $ ( , 4
@ D H L P T X \ ` d h l p t x |  
      *           
        $ ( , 0 4 8 < @ D `
d h l p t x |        *   
                 
 $ ( , 0 4 8 @ D H L P T X \ ` d h
l p t x |         *    
                 
         $ ( , 0 4 8 < @ D H
L P T X \ ` d h l t x |  *    
                 
  $ ( , 0 4 8 Delete_CLIB_OPT_FromCommandLine >
@DllMain  @ [ @LIBPERL@Perl_croak  @L @ @i
@LIBPERL@Perl_croak_nocontext -2 @}4 @LIBPERL@Perl_form  @
@LIBPERL@Perl_get_sv  @ @LIBPERL@Perl_gv_init 
@LIBPERL@Perl_gv_stashpv  @X @LIBPERL@Perl_hv_fetch F @
@LIBPERL@Perl_newCONSTSUB q @LIBPERL@Perl_newSViv  @
@LIBPERL@Perl_newSVpv  @LIBPERL@Perl_newXS 
@LIBPERL@Perl_sv_2pv_flags k @  @  @% @
LibCPostStart  @ LibCPreStart @
StartNLMInNKS  @TerminateNLMFromNKS 9 @_NonAppCheckUnload
 @ _NonAppStart { @ L @ _NonAppStop  @___errno g
@ @* @__deinit_environment * @__init_environment 
@_set_vm_context  @ @ @ @main  @ memset
@register_library w @strcmpp  @ @ @ @ @ @Q
@e @ @ @ @ @ @
@/
@G
@_
@w
@
@
@
@
@
@ @ @7 @O @g @ @ @ @ @ @ @ @' @? @W @o @ @ @
@ @ @ @
@/
@G
@_
@w
@
@
@
@
@
@ @ @7 @O @g @ @ @ @ @ @ @ @ @ @ @ @ @)
@A @Y @q @ @ @ @ @ @ @ @1 @I @a @y @ @ @ @
@ @  @! @9 @M @e @} @ @ @* @ @ @ @ @) @e @} @
@* @ @ @ @
 @% @= @U @y @ @ @ @ @ @ @/ @G @_ @ @* @ @ @ @
 @% @= @U @m @ @ @ @ @ @ @- @E @] @u @ @ @ @
@ @ @ @5 @M @e @} @ @* @ @ @ @
 @% @= @U @m @ @ @ @ @@ @T @* @ @ @ @ @, @i @
@ @ @ @ @ @ @} @ @* @ @ @ @
! @%! @=! @U! @m! @! @! @! @! @! @! @" @-" @E" @]" @u" @" @" @"
@" @" @# @# @5# @M# @e# @}# @# @*# @# @# @# @
$ @%$ @=$ @U$ @m$ @$ @$ @*$ @$ @' @%' @U' @m' @' @' @' @' @' @' @
( @!( @5( @I( @( @( @( @( @) @) @5) @M) @e) @}) @) @*) @) @) @) @
* @%* @=* @U* @m* @* @* @* @* @* @* @+ @-+ @E+ @]+ @u+ @+ @+ @+
@+ @+ @, @, @5, @M, @e, @}, @, @*, @, @, @, @- @- @. @". @:. @R.
@j. @. @. @. @. @. @. @/ @*/ @B/ @Z/ @r/ @/ @/ @/ @/ @0 @0
@50 @I0 @a0 @y0 @0 @0 @0 @0 @1 @)1 @A1 @Y1 @q1 @1 @1 @1 @2 @E2
@\2 @u2 @2 @2 @2 @2 @ 3 @#3 @@3 @Z3 @t3 @3 @3 @3 @3 @4 @B4 @[4
@4 @4 @4 @4 @4 @5 @!5 @>5 @X5 @r5 @5 @5 @strlen 1
@strncmp , @ @1 @unregister_library  @i
@MPCONST@boot_ModPerl__Const -
MPCONST@modperl_constants_group_lookup_apache3
MPCONST@XS_modperl_const_compile 
'MPCONST@modperl_constants_lookup_apache
MPCONST@modperl_const_compile  $MPCONST@modperl_constants_lookup_apr`
*MPCONST@modperl_constants_group_lookup_apr1 /B>&nbsp;Send the HTTP
response headers, if this has not already occurred.</TD>
</TR>
<TR BGCOLOR="white" CLASS="TableRowColor">
<TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1">
<CODE>&nbsp;void</CODE></FONT></TD>
<TD><CODE><B><A
HREF="../../../../org/apache/catalina/connector/HttpResponseBase.html#sendR
edirect(java.lang.String)">sendRedirect</A></B>
(java.lang.String&nbsp;location)</CODE>

<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&n bsp;&nbsp;<B>Depreca=head
1 NAME

Encode::Supported -- Encodings supported by Encode

=head1 DESCRIPTION

=head2 Encoding Names

Encoding names are case insensitive. White space in names
is ignored. In addition, an encoding may have aliases.
Each encoding has one "canonical" name. The "canonical"
name is chosen from the names of the encoding by picking
the first in the following sequence (with a few exceptions).

=over 4

=item *

The name used by the Perl community. That includes 'utf8' and 'ascii'.
Unlike aliases, canonical names directly reach the method so such
frequently used words like 'utf8' don't need to do alias lookups.

=item *

The MIME name as defined in IETF RFCs. This includes all "iso-"s.

=item *

The name in the IANA registry.

=item *

The name used by the organization that defined it.

=back

In case I<de jure> canonical names differ from that of the Encode
module, they are always aliased if it ever be implemented. So you can
safely tell if a given encoding is implemented or not just by passing
the canonical name.

Because of all the alias issues, and because in the general case
encodings have state, "Encode" uses an encoding object internally
once an operation is in progress.

=head1 Supported Encodings

As of Perl 5.8.0, at least the following encodings are recognized.
Note that unless otherwise specified, they are all case insensitive
(via alias) and all occurrence of spaces are replaced with '-'.
In other words, "ISO 8859 1" and "iso-8859-1" are identical.

Encodings are categorized and implemented in several different modules
but you don't have to C<use Encode::XX> to make them available for
most cases. Encode.pm will automatically load those modules on demand.

=head2 Built-in Encodings

The following encodings are always available.

Canonical Aliases Comments & References
----------------------------------------------------------------
ascii US-ascii ISO-646-US [ECMA]
ascii-ctrl Special Encoding
iso-8859-1 latin1 [ISO]
null Special Encoding
utf8 UTF-8 [RFC2279]
----------------------------------------------------------------

I<null> and I<ascii-ctrl> are special. "null" fails for all character
so when you set fallback mode to PERLQQ, HTMLCREF or XMLCREF, ALL
CHARACTERS will fall back to character references. Ditto for
"ascii-ctrl" except for control characters. For fallback modes, see
L<Encode>.

=head2 Encode::Unicode -- other Unicode encodings

Unicode coding schemes other than native utf8 are supported by
Encode::Unicode, which will be autoloaded on demand.

----------------------------------------------------------------
UCS-2BE UCS-2, iso-10646-1 [IANA, UC]
UCS-2LE [UC]
UTF-16 [UC]
UTF-16BE [UC]
UTF-16LE [UC]
UTF-32 [UC]
UTF-32BE UCS-4 [UC]
UTF-32LE [UC]
UTF-7 [RFC2152]
----------------------------------------------------------------

To find how (UCS-2|UTF-(16|32))(LE|BE)? differ from one another,
see L<Encode::Unicode>.

UTF-7 is a special encoding which "re-encodes" UTF-16BE into a 7-bit
encoding. It is implemeneted seperately by Encode::Unicode::UTF7.

=head2 Encode::Byte -- Extended ASCII

Encode::Byte implements most single-byte encodings except for
Symbols and EBCDIC. The following encodings are based on single-byte
encodings implemented as extended ASCII. Most of them map
\x80-\xff (upper half) to non-ASCII characters.

=over 4

=item ISO-8859 and corresponding vendor mappings

Since there are so many, they are presented in table format with
languages and corresponding encoding names by vendors. Note that
the table is sorted in order of ISO-8859 and the corresponding vendor
mappings are slightly different from that of ISO. See
L<http://czyborra.com/charsets/iso8859.html> for details.

Lang/Regions ISO/Other Std. DOS Windows Macintosh Others
----------------------------------------------------------------
N. America (ASCII) cp437 AdobeStandardEncoding
cp863 (DOSCanadaF)
W. Europe iso-8859-1 cp850 cp1252 MacRoman nextstep
hp-roman8
cp860 (DOSPortuguese)
Cntrl. Europe iso-8859-2 cp852 cp1250 MacCentralEurRoman
MacCroatian
MacRomanian
MacRumanian
Latin3[1] iso-8859-3
Latin4[2] iso-8859-4
Cyrillics iso-8859-5 cp855 cp1251 MacCyrillic
(See also next section) cp866 MacUkrainian
Arabic iso-8859-6 cp864 cp1256 MacArabic
cp1006 MacFarsi
Greek iso-8859-7 cp737 cp1253 MacGreek
cp869 (DOSGreek2)
Hebrew iso-8859-8 cp862 cp1255 MacHebrew
Turkish iso-8859-9 cp857 cp1254 MacTurkish
Nordics iso-8859-10 cp865
cp861 MacIcelandic
MacSami
Thai iso-8859-11[3] cp874 MacThai
(iso-8859-12 is nonexistent. Reserved for Indics?)
Baltics iso-8859-13 cp775 cp1257
Celtics iso-8859-14
Latin9 [4] iso-8859-15
Latin10 iso-8859-16
Vietnamese viscii cp1258 MacVietnamese
----------------------------------------------------------------

[1] Esperanto, Maltese, and Turkish. Turkish is now on 8859-9.
[2] Baltics. Now on 8859-10, except for Latvian.
[3] TIS 620 + Non-Breaking Space (0xA0 / U+00A0)
[4] Nicknamed Latin0; the Euro sign as well as French and Finnish
letters that are missing from 8859-1 were added.

All cp* are also available as ibm-*, ms-*, and windows-* . See also
L<http://czyborra.com/charsets/codepages.html>.

Macintosh encodings don't seem to be registered in such entities as
IANA. "Canonical" names in Encode are based upon Apple's Tech Note
1150. See L<http://developer.apple.com/technotes/tn/tn1150.html>
for details.

=item KOI8 - De Facto Standard for the Cyrillic world

Though ISO-8859 does have ISO-8859-5, the KOI8 series is far more
popular in the Net. L<Encode> comes with the following KOI charsets.
For gory details, see L<http://czyborra.com/charsets/cyrillic.html>

----------------------------------------------------------------
koi8-f
koi8-r cp878 [RFC1489]
koi8-u [RFC2319]
----------------------------------------------------------------

=item gsm0338 - Hentai Latin 1

GSM0338 is for GSM handsets. Though it shares alphanumerals with
ASCII, control character ranges and other parts are mapped very
differently, mainly to store Greek characters. There are also escape
sequences (starting with 0x1B) to cover e.g. the Euro sign. Some
special cases like a trailing 0x00 byte or a lone 0x1B byte are not
well-defined and decode() will return an empty string for them.
One possible workaround is

$gsm =~ s/\x00\z/\x00\x00/;
$uni = decode("gsm0338", $gsm);
$uni .= "\xA0" if $gsm =~ /\x1B\z/;

Note that the Encode implementation of GSM0338 does not implement the
reuse of Latin capital letters as Greek capital letters (for example,
the 0x5A is U+005A (LATIN CAPITAL LETTER Z), not U+0396 (GREEK CAPITAL
LETTER ZETA).

The GSM0338 is also covered in Encode::Byte even though it is not
an "extended ASCII" encoding.

=back

=head2 CJK: Chinese, Japanese, Korean (Multibyte)

Note that Vietnamese is listed above. Also read "Encoding vs Charset"
below. Also note that these are implemented in distinct modules by
countries, due to the size concerns (simplified Chinese is mapped
to 'CN', continental China, while traditional Chinese is mapped to
'TW', Taiwan). Please refer to their respective documentation pages.

=over 4

=item Encode::CN -- Continental China

  Standard      DOS/Win    Macintosh       Comment/Reference
  ----------------------------------------------------------------
  euc-cn [1]               MacChineseSimp
  (gbk)         cp936 [2]
  gb12345-raw                              { GB12345 without CES }
  gb2312-raw                               { GB2312 without CES }
  hz
  iso-ir-165
  ----------------------------------------------------------------

[1] GB2312 is aliased to this. See L<Microsoft-related naming mess>
[2] gbk is aliased to this. See L<Microsoft-related naming mess>

=item Encode::JP -- Japan

  Standard      DOS/Win    Macintosh       Comment/Reference
  ----------------------------------------------------------------
  euc-jp
  shiftjis      cp932      macJapanese
  7bit-jis
  iso-2022-jp                              [RFC1468]
  iso-2022-jp-1                            [RFC2237]
  jis0201-raw   { JIS X 0201 (roman + halfwidth kana) without CES }
  jis0208-raw   { JIS X 0208 (Kanji + fullwidth kana) without CES }
  jis0212-raw   { JIS X 0212 (Extended Kanji) without CES }
  ----------------------------------------------------------------

=item Encode::KR -- Korea

  Standard      DOS/Win    Macintosh       Comment/Reference
  ----------------------------------------------------------------
  euc-kr                   MacKorean       [RFC1557]
                cp949 [1]
  iso-2022-kr                              [RFC1557]
  johab                                    [KS X 1001:1998, Annex 3]
  ksc5601-raw                              { KSC5601 without CES }
  ----------------------------------------------------------------

[1] ks_c_5601-1987, (x-)?windows-949, and uhc are aliased to this.
See below.

=item Encode::TW -- Taiwan

  Standard      DOS/Win    Macintosh       Comment/Reference
  ----------------------------------------------------------------
  big5-eten     cp950      MacChineseTrad  {big5 aliased to big5-eten}
  big5-hkscs
  ----------------------------------------------------------------

=item Encode::HanExtra -- More Chinese via CPAN

Due to size concerns, the additional Chinese encodings below are
distributed separately on CPAN, under the name Encode::HanExtra.

  Standard      DOS/Win    Macintosh       Comment/Reference
  ----------------------------------------------------------------
  big5ext       CMEX's Big5e Extension
  big5plus      CMEX's Big5+ Extension
  cccii         Chinese Character Code for Information Interchange
  euc-tw        EUC (Extended Unix Character)
  gb18030       GBK with Traditional Characters
  ----------------------------------------------------------------

=item Encode::JIS2K -- JIS X 0213 encodings via CPAN

Due to size concerns, additional Japanese encodings below are
distributed separately on CPAN, under the name Encode::JIS2K.

  Standard      DOS/Win    Macintosh       Comment/Reference
  ----------------------------------------------------------------
  euc-jisx0213
  shiftjisx0213
  iso-2022-jp-3
  jis0213-1-raw
  jis0213-2-raw
  ----------------------------------------------------------------

=back

=head2 Miscellaneous encodings

=over 4

=item Encode::EBCDIC

See L<perlebcdic> for details.

----------------------------------------------------------------
cp37
cp500
cp875
cp1026
cp1047
posix-bc
----------------------------------------------------------------

=item Encode::Symbols

For symbols and dingbats.

----------------------------------------------------------------
symbol
dingbats
MacDingbats
AdobeZdingbat
AdobeSymbol
----------------------------------------------------------------

=item Encode::MIME::Header

Strictly speaking, the MIME header encoding documented in RFC 2047 is
more of an encapsulation than an encoding. However, supporting it is
imperative in the modern world, so it is supported.

  ----------------------------------------------------------------
  MIME-Header                                          [RFC2047]
  MIME-B                                               [RFC2047]
  MIME-Q                                               [RFC2047]
  ----------------------------------------------------------------
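
As a minimal sketch of how these names might be used (the sample
string and variable names are only illustrative), the following
encodes a header value with C<MIME-Header> and C<MIME-Q> and decodes
it back. The exact encoded form may vary between Encode versions, so
no output is shown here.

    use strict;
    use warnings;
    use Encode qw(encode decode);

    # Illustrative header text containing one non-ASCII character.
    my $subject = "Caf\x{e9}";

    # MIME-Header (same as MIME-B) uses Base64 encoded-words;
    # MIME-Q uses quoted-printable encoded-words.
    my $header_b = encode("MIME-Header", $subject);
    my $header_q = encode("MIME-Q",      $subject);

    print "B: $header_b\n";
    print "Q: $header_q\n";

    # Decoding either form should give back the original characters.
    my $decoded = decode("MIME-Header", $header_b);
    print "round trip ok\n" if $decoded eq $subject;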

=item Encode::Guess

This is not the name of an encoding but a utility that lets you pick
the most appropriate encoding for your data out of a given list of
I<suspects>. See L<Encode::Guess> for details.
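
A short sketch of the idea, assuming the data is Japanese text in one
of three suspect encodings (the sample text and the suspect list are
illustrative choices, not a recommendation):

    use strict;
    use warnings;
    use Encode qw(encode);
    use Encode::Guess;

    # Build some EUC-JP octets to stand in for data of unknown origin.
    my $octets = encode("euc-jp", "\x{65e5}\x{672c}\x{8a9e}");

    # guess_encoding() returns an encoding object on success and a
    # plain error string when it fails or the result is ambiguous.
    my $enc = guess_encoding($octets, qw(euc-jp shiftjis 7bit-jis));
    die "Cannot guess: $enc\n" unless ref $enc;

    my $text = $enc->decode($octets);
    printf "guessed %s, %d characters\n", $enc->name, length $text;

Guessing can fail or be ambiguous for short inputs, so checking
C<ref> on the return value before using it is worthwhile.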

=back

=head1 Unsupported encodings

The following encodings are not supported as yet; some because they
are rarely used, some because of technical difficulties. They may
be supported by external modules via CPAN in the future, however.

=over 4

=item ISO-2022-JP-2 [RFC1554]

Not very popular yet. Needs the Unicode Database or an equivalent to
implement encode(), because it includes JIS X 0208/0212, KSC5601, and
GB2312 simultaneously, whose code points in Unicode overlap. You
therefore need to look up the database to determine which character
set a given Unicode character should belong to.

=item ISO-2022-CN [RFC1922]

Not very popular. Needs CNS 11643-1 and -2, which are not available in
this module. CNS 11643 is supported (via euc-tw) in Encode::HanExtra.
Autrijus Tang may add support for this encoding in his module in the
future.

=item Various HP-UX encodings

The following are unsupported due to the lack of mapping data.

'8' - arabic8, greek8, hebrew8, kana8, thai8, and turkish8
'15' - japanese15, korean15, and roi15

=item Cyrillic encoding ISO-IR-111

Anton Tagunov doubts its usefulness.

=item ISO-8859-8-1 [Hebrew]

None of the Encode team knows Hebrew well enough (ISO-8859-8, cp1255
and MacHebrew are supported only because mappings were
available at L<http://www.unicode.org/>). Contributions welcome.

=item ISIRI 3342, Iran System, ISIRI 2900 [Farsi]

Ditto.

=item Vietnamese encoding TCVN

Ditto.

=item Vietnamese encodings VPS

Though Jungshik Shin has reported that Mozilla supports this encoding,
the report came too late for us to add it before 5.8.0. In the future,
it may be available via a separate module. See
L<http://lxr.mozilla.org/seamonkey/source/intl/uconv/ucvlatin/vps.uf>
and
L<http://lxr.mozilla.org/seamonkey/source/intl/uconv/ucvlatin/vps.ut>
if you are interested in helping us.

=item Various Mac encodings

The following are unsupported due to the lack of mapping data.

MacArmenian, MacBengali, MacBurmese, MacEthiopic
MacExtArabic, MacGeorgian, MacKannada, MacKhmer
MacLaotian, MacMalayalam, MacMongolian, MacOriya
MacSinhalese, MacTamil, MacTelugu, MacTibetan
MacVietnamese

The Mac encodings that are already available are based upon the vendor
mappings at L<http://www.unicode.org/Public/MAPPINGS/VENDORS/APPLE/> .

=item (Mac) Indic encodings

The maps for the following are available at L<http://www.unicode.org/>
but remain unsupported because those encodings need an algorithmic
approach, which F<enc2xs> does not currently support:

MacDevanagari
MacGurmukhi
MacGujarati

For details, please see C<Unicode mapping issues and notes:> at
L<http://www.unicode.org/Public/MAPPINGS/VENDORS/APPLE/DEVANAGA.TXT> .

I believe this issue is prevalent not only for Mac Indics but also in
other Indic encodings, but the above were the only Indic encodings
maps that I could find at L<http://www.unicode.org/> .

=back

=head1 Encoding vs. Charset -- terminology

We are used to using the terms (character) I<encoding> and I<character
set> interchangeably. But just as confusing the terms byte and
character is dangerous and they should be differentiated when
needed, we need to differentiate I<encoding> and I<character set>.

To understand that, here is a description of how we make computers
grok our characters.

=over 4

=item *

First we start with which characters to include. We call this
collection of characters a I<character repertoire>.

=item *

Then we have to give each character a unique ID so your computer can
tell the difference between 'a' and 'A'. This itemized character
repertoire is now a I<character set>.

=item *

If your computer can grok the character set without further
processing, you can go ahead and use it. This is called a I<coded
character set> (CCS) or I<raw character encoding>. ASCII is used this
way in most cases.

=item *

But in many cases, especially multi-byte CJK encodings, you have to
tweak a little more. Your network connection may not accept any data
with the Most Significant Bit set, and your computer may not be able to
tell if a given byte is a whole character or just half of it. So you
have to I<encode> the character set to use it.

A I<character encoding scheme> (CES) determines how to encode a given
character set, or a set of multiple character sets. 7bit ISO-2022 is
an example of a CES. You switch between character sets via I<escape
sequences>.

=back

Technically, or mathematically, speaking, a character set encoded in
such a CES that maps character by character may form a CCS. EUC is such
an example. The CES of EUC is as follows:

=over 4

=item *

Map ASCII unchanged.

=item *

Map a character set that consists of 94 or 96 to the power of N
members (that is, N bytes per character, each byte taking one of 94
or 96 values) by adding 0x80 to each byte.

=item *

You can also use 0x8e and 0x8f to indicate that the following sequence of
characters belongs to yet another character set. The value 0x80 is
added to each following byte.

=back

By carefully looking at the encoded byte sequence, you can find that
each byte sequence corresponds to a unique number. In that sense, EUC
is a CCS generated by the CES above from up to four CCSs (complicated?).
UTF-8 falls into this category. See L<perlunicode/"UTF-8"> to find out
how UTF-8 maps Unicode to a byte sequence.

You may also have found out by now why 7bit ISO-2022 cannot comprise
a CCS. If you look at a byte sequence \x21\x21, you can't tell if
it is two !'s or IDEOGRAPHIC SPACE. EUC maps the latter to \xA1\xA1
so you have no trouble differentiating between "!!" and S<" ">.
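
The relationship can be sketched with Encode itself. This example
assumes that C<jis0208-raw> yields the bare two-byte JIS X 0208 code
for a character, and compares that code, with 0x80 added to each
byte, against the C<euc-jp> octets for IDEOGRAPHIC SPACE (U+3000).
The variable names are illustrative.

    use strict;
    use warnings;
    use Encode qw(encode);

    my $ideographic_space = "\x{3000}";

    # The coded character set alone, without any encoding scheme.
    my $raw = encode("jis0208-raw", $ideographic_space);

    # The same character after the EUC encoding scheme is applied.
    my $euc = encode("euc-jp", $ideographic_space);

    # Apply the EUC rule by hand: add 0x80 to every byte.
    my $shifted = join "", map { chr(ord($_) + 0x80) } split //, $raw;

    printf "raw: %s\n", unpack("H*", $raw);
    printf "euc: %s\n", unpack("H*", $euc);
    print  "raw + 0x80 per byte matches euc-jp\n" if $shifted eq $euc;

    # Two exclamation marks stay in the ASCII range under EUC.
    printf "!! : %s\n", unpack("H*", encode("euc-jp", "!!"));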

=head1 Encoding Classification (by Anton Tagunov and Dan Kogai)

This section tries to classify the supported encodings by their
applicability for information exchange over the Internet and to
choose the most suitable aliases to name them in the context of
such communication.

=over 4

=item *

To (en|de)code encodings marked by C<(**)>, you need
C<Encode::HanExtra>, available from CPAN.

=back

Encoding names

US-ASCII UTF-8 ISO-8859-* KOI8-R
Shift_JIS EUC-JP ISO-2022-JP ISO-2022-JP-1
EUC-KR Big5 GB2312

are registered with IANA as preferred MIME names and may
be used over the Internet.

C<Shift_JIS> has been made official by JIS X 0208:1997.
L<Microsoft-related naming mess> gives details.

C<GB2312> is the IANA name for C<EUC-CN>.
See L<Microsoft-related naming mess> for details.

C<GB_2312-80> I<raw> encoding is available as C<gb2312-raw>
with Encode. See L<Encode::CN> for details.

EUC-CN
KOI8-U [RFC2319]

have not been registered with IANA (as of March 2002) but
seem to be supported by major web browsers.
The IANA name for C<EUC-CN> is C<GB2312>.

KS_C_5601-1987

is heavily misused.
See L<Microsoft-related naming mess> for details.

C<KS_C_5601-1987> I<raw> encoding is available as C<ksc5601-raw>
with Encode. See L<Encode::KR> for details.

UTF-16 UTF-16BE UTF-16LE

are IANA-registered C<charset>s. See [RFC 2781] for details.
Jungshik Shin reports that UTF-16 with a BOM is well accepted
by MS IE 5/6 and NS 4/6. Beware however that

=over 4

=item *

C<UTF-16> support in any software you're going to be
using/interoperating with has probably been less tested
than C<UTF-8> support

=item *

C<UTF-8> coded data seamlessly passes traditional
command piping (C<cat>, C<more>, etc.) while C<UTF-16> coded
data is likely to cause confusion (with its zero bytes,
for example)

=item *

it is beyond the power of words to describe the way HTML browsers
encode non-C<ASCII> form data. To get a general impression, visit
L<http://ppewww.ph.gla.ac.uk/~flavell/charset/form-i18n.html>.
While encoding of form data has stabilized for C<UTF-8> encoded pages
(at least IE 5/6, NS 6, and Opera 6 behave consistently), be sure to
expect fun (and cross-browser discrepancies) with C<UTF-16> encoded
pages!

=back

The rule of thumb is to use C<UTF-8> unless you know what
you're doing and unless you really benefit from using C<UTF-16>.
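
To see the zero bytes mentioned above, here is a small sketch that
encodes the same ASCII-only string both ways and counts the NUL bytes
in each result (the sample string is arbitrary):

    use strict;
    use warnings;
    use Encode qw(encode);

    my $text  = "Hello";
    my $utf8  = encode("UTF-8",  $text);
    my $utf16 = encode("UTF-16", $text);   # BOM is prepended on encode

    # tr/// with an empty replacement only counts matching bytes.
    my $nul_in_utf8  = ($utf8  =~ tr/\x00//);
    my $nul_in_utf16 = ($utf16 =~ tr/\x00//);

    printf "UTF-8 : %2d bytes, %d NUL\n", length $utf8,  $nul_in_utf8;
    printf "UTF-16: %2d bytes, %d NUL\n", length $utf16, $nul_in_utf16;

Tools that treat NUL as a string terminator will generally handle the
first result and stumble over the second.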

ISO-IR-165 [RFC1345]
VISCII
GB 12345
GB 18030 (**) (see links below)
EUC-TW (**)

are totally valid encodings but not registered at IANA.
The names under which they are listed here are probably the
most widely-known names for these encodings and are recommended
names.

BIG5PLUS (**)

is a proprietary name.

=head2 Microsoft-related naming mess

Microsoft products misuse the following names:

=over 4

=item KS_C_5601-1987

Microsoft extension to C<EUC-KR>.

Proper names: C<CP949>, C<UHC>, C<x-windows-949> (as used by Mozilla).

See L<http://lists.w3.org/Archives/Public/ietf-charsets/2001AprJun/0033.html>
for details.

Encode aliases C<KS_C_5601-1987> to C<cp949> to reflect this common
misuse. I<Raw> C<KS_C_5601-1987> encoding is available as
C<ksc5601-raw>.

See L<Encode::KR> for details.

=item GB2312

Microsoft extension to C<EUC-CN>.

Proper names: C<CP936>, C<GBK>.

C<GB2312> has been registered in the C<EUC-CN> meaning at
IANA. This has partially repaired the situation: Microsoft's
C<GB2312> has become a superset of the official C<GB2312>.

Encode aliases C<GB2312> to C<euc-cn> in full agreement with
IANA registration. C<cp936> is supported separately.
I<Raw> C<GB_2312-80> encoding is available as C<gb2312-raw>.

See L<Encode::CN> for details.

=item Big5

Microsoft extension to C<Big5>.

Proper name: C<CP950>.

Encode separately supports C<Big5> and C<cp950>.

=item Shift_JIS

Microsoft's understanding of C<Shift_JIS>.

JIS has not endorsed the full Microsoft standard, however.
The official C<Shift_JIS> includes only JIS X 0201 and JIS X 0208
character sets, while Microsoft has always used C<Shift_JIS>
to encode a wider character repertoire. See C<IANA> registration for
C<Windows-31J>.

As a historical predecessor, Microsoft's variant
probably has more rights for the name, though it may be objected
that Microsoft shouldn't have used JIS as part of the name
in the first place.

Unambiguous name: C<CP932>. C<IANA> name (not used?): C<Windows-31J>.

Encode separately supports C<Shift_JIS> and C<cp932>.

=back
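
To check how a given name is resolved on your installation, a sketch
along these lines may help. It uses C<Encode::resolve_alias>, which
returns the canonical Encode name or a false value; the list of names
tried is just a sample taken from this section.

    use strict;
    use warnings;
    use Encode;

    # Names discussed above, plus their unambiguous counterparts.
    my @names = qw(GB2312 GBK KS_C_5601-1987 UHC Big5 CP950
                   Shift_JIS CP932);

    for my $name (@names) {
        my $canonical = Encode::resolve_alias($name);
        printf "%-16s => %s\n", $name, $canonical || "(not found)";
    }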

=head1 Glossary

=over 4

=item character repertoire

A collection of unique characters. A I<character set> in the strictest
sense. At this stage, characters are not numbered.

=item coded character set (CCS)

A character set that is mapped in a way computers can use directly.
Many character encodings, including EUC, fall in this category.

=item character encoding scheme (CES)

An algorithm to map a character set to a byte sequence. You don't
have to be able to tell which character set a given byte sequence
belongs to. 7-bit ISO-2022 is a CES but it cannot be a CCS. EUC is an
example of being both a CCS and a CES.

=item charset (in MIME context)

has long been used in the meaning of C<encoding>, CES.

While the word combination C<character set> has lost this meaning
in MIME context since [RFC 2130], the C<charset> abbreviation has
retained it. This is how [RFC 2277] and [RFC 2278] bless C<charset>:

  This document uses the term "charset" to mean a set of rules for
  mapping from a sequence of octets to a sequence of characters, such
  as the combination of a coded character set and a character encoding
  scheme; this is also what is used as an identifier in MIME "charset="
  parameters, and registered in the IANA charset registry ... (Note
  that this is NOT a term used by other standards bodies, such as ISO).
  [RFC 2277]

=item EUC

Extended Unix Ch