/*********************************************************************
   C_DECODE.TXT: TechDoc on Combinatorial Compression

   C++ code for encoding and decoding of combinatorial design files
   implemented in the program XVRFY v1.03 and higher for optional use.

   It describes the format of the option -g1 only, which is
   Text Compression. The format of the Binary Compression (-g2,
   and higher formats in the future) will be published at a later date.


   Author : Uenal Mutlu
   Date   : 960907Sa
   Version: 1.09

   History:
   --------
   960826Mo 1.03 r -Initial version, released in XVRFY13B.ZIP

   960907Sa 1.09   -Bug in description corrected: the resulting checksums
                     must be taken mod(65556L) as was shown in the code,
                     not mod(2^16) as was written in the text part.
                     It's a "magic" number. For compatibility reasons
                     all programs using this must use the same constant.
                    -IdHdr slightly changed: see bCompr below.

   961209Mo 1.25  r -cosmetic   


   OVERVIEW:
   ---------

   Decoding Structure:
   -------------------
     ...
     Open text file
       read the ID line
       allocate array
       call CompressedInput(.)
     Close file
     ...


   Encoding Structure:
   -------------------
     ...
     Do a complete sort of the design data (the points inside each block,
     and also the blocks themselves) in ascending order
     Open text file
       Write ID line
       Call CompressedOutput(.)
     Close file
     ...
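
   The sorting step above can be sketched as follows. This is only a
   minimal sketch: the types Block and Design and the function SortDesign
   are illustrative names, not part of XVRFY. It assumes that the plain
   lexicographic order of the sorted blocks is the order wanted here,
   which agrees with the numbering examples given for Blocknbr2Block
   (1 2 3 4 5 6 is block 1, 44 45 46 47 48 49 is the last block).

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

typedef std::vector<unsigned char> Block;   // one block: k point numbers (hypothetical type)
typedef std::vector<Block>         Design;  // b blocks (hypothetical type)

// Sorts the points inside every block, then the blocks themselves,
// both in ascending order. Duplicate blocks are kept.
void SortDesign(Design& d)
{
    for (Design::size_type i = 0; i < d.size(); ++i)
    {
        assert(!d[i].empty());               // the sketch assumes k >= 1
        std::sort(d[i].begin(), d[i].end());
    }
    std::sort(d.begin(), d.end());           // vectors compare lexicographically
}
```

   After SortDesign, the blocks {5,1,3} and {2,1,4} come out as
   1 2 4 followed by 1 3 5.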


   PRELIMINARIES:
   --------------
   - all values are unsigned, and in the range 0..2^32-1

   - comment chars are '#', ';'. They and the rest of the line will
     be ignored

   - separators are: ',', space, '/', '\t', '\r', '\n' and everything else
     except digits and '^' (currently we don't use any float values)

   - line ends are the usual '\n' or '\r\n'  (DOS text file)

   - all lines, incl. comment lines, should not be more than 256 chars long

   - The first non-space char need not be at the beginning of
     the line; i.e., spaces may precede it

   - SysFiles consist of an ID line, which also tells whether the data is
     compressed, and data lines. The structure of the ID line is:
       ID v b k v1 w1/w2 t/m L x Base bCompr CHdrItems CEndItems CAvgLineLen CfExpUsed ...
     where:
       - v1 is normally the same as v
       - w1 and w2 are currently unused
       - if bCompr is 1 or 2, then Base is also 1
     The total number of entries can vary, but should not be less than
     shown here. The defaults are:
       - v1 = v
       - w1 = w2 = 0
       - L  = 1
       - x  = 2
       - Base = 1
       - bCompr    = 0  (0=none, 1=ComprText, 2=ComprBin, 3..255=other compr.algorithms)
       - CHdrItems = 4      (4..31)
       - CEndItems = 2      (2..31)
       - CAvgLineLen = 64   (32..224; average line length for compressed lines)
       - CfExpUsed = 1      (if '^' was used; ie. RLE)

   - All design files (compressed or not, text or binary format) have an
     ID _line_ at the top of the file. From it, one can see whether and
     which compression scheme was used, and so call the appropriate
     decoding and/or processing routine. The scheme described here
     corresponds to the bCompr value "1" (see above).

   - Structure of the compressed data
       - HdrItems(4..31)  (cHdrItems,b,v,k)
       - DataItems(b)     (ie. b values)
       - EndItems(2..31)  (checksums are stored here)

   - Even if the ID line is not present, the data still can be decoded,
     because the relevant data (cHdrItems,b,v,k) can be retrieved from
     the beginning of the first code line... But, it's not recommended
     to omit the ID line. t,m,l and other data would be lost.

   - optional lines start with alphabetic keywords or commentchar

   - code lines start with digits as the first char (no comma, ^, or anything
     else but digits)

   - c^n means n times the nbr c (used for dataitems only)

   - code lines have an average length of AvgLineLen

   - currently 2 checksums (simple cumulative sums) are appended at the
     end of the code:
       checksum1: checksum of the combinatorial indices (each 1..Cn(v,k));
         initially 0
         beware1: does not include any HDRITEMS and ENDITEMS vals!
         beware2: of the resulting sum, only mod(65556L) has to be taken!
       checksum2: checksum of the compressed entries; initially 0
         beware1: does not include any HDRITEMS and ENDITEMS vals!
         beware2: of the resulting sum, only mod(65556L) has to be taken!
         beware3: checksum is calculated before the exponential format is built

   - only the design data will be compressed, additional lines
     starting with non-digits can still be inserted into the file,
     since it's a normal text file.

   - no data line can start with ',' or '^'

   - Duplicate blocks, as can happen with non-simple L>1 designs, will be
     handled too.
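
   The line-level rules above (comment chars, separators, the "first
   char must be a digit" test and the c^n form) can be sketched as one
   small parser. This is a hedged illustration, not the XVRFY code:
   ParseCodeLine is a hypothetical name, and it follows the textual rule
   "everything except digits and '^' separates", whereas the decoder
   shown in this file uses strtok with a fixed delimiter set.

```cpp
#include <cassert>
#include <cctype>
#include <cstdint>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper: appends the values of one text line to 'out',
// expanding c^n into n copies of c. Returns false for lines that are
// not code lines (blank, comment, first char not a digit) and for a
// malformed c^n; in the latter case 'out' may hold part of the line.
bool ParseCodeLine(const std::string& line, std::vector<std::uint32_t>& out)
{
    // Cut the line at the first comment char ('#' or ';').
    const std::string s = line.substr(0, line.find_first_of("#;"));

    // Leading spaces are allowed; the first other char must be a digit.
    std::string::size_type pos = s.find_first_not_of(" \t");
    if (pos == std::string::npos || !std::isdigit((unsigned char)s[pos]))
        return false;

    while (pos < s.size())
    {
        if (!std::isdigit((unsigned char)s[pos])) { ++pos; continue; }  // separator

        char* end = 0;
        const std::uint32_t val = (std::uint32_t)std::strtoul(s.c_str() + pos, &end, 10);
        assert(end != s.c_str() + pos);       // at least one digit was consumed

        std::uint32_t rep = 1;
        if (*end == '^')                      // run-length form c^n
        {
            if (!std::isdigit((unsigned char)end[1]))
                return false;                 // '^' without a count
            rep = (std::uint32_t)std::strtoul(end + 1, &end, 10);
            if (rep == 0)
                return false;                 // n = 0 is treated as an error
        }
        out.insert(out.end(), rep, val);
        pos = (std::string::size_type)(end - s.c_str());
    }
    return true;
}
```

   For the line "  4,3,7,3 1^2/5   ; trailing comment" the sketch appends
   the seven values 4 3 7 3 1 1 5. Lines starting with a keyword or a
   comment char are skipped, so they leave 'out' unchanged.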

   --> compressed data can be sent in emails and posted to newsgroups
       since they contain printable ascii-chars only (mainly digits,
       comma and the exp sign '^').

   --> code lines should never be manipulated or split!
       --> as is also the case with uuencoded files...
       In XVRFY the length of the line can be set individually.

   --> the resulting design of compressed data is always sorted
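
   Putting the pieces of this section together, the value stream
   (HdrItems, DataItems, EndItems) can be sketched as a small round
   trip. The DataItems are the differences between consecutive block
   numbers, as the decoder in this file accumulates them. The names U32,
   kMagicMod, BuildItems, ItemsToText and DecodeItems are illustrative
   only. The sketch assumes exactly 4 HdrItems and 2 EndItems, uses a
   32-bit type to stand in for TULONG, writes one long line (wrapping at
   CAvgLineLen is omitted), and uses c^n for every run of 2 or more
   equal DataItems, which may differ from what XVRFY emits.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

typedef std::uint32_t U32;            // stands in for a 32-bit TULONG
const U32 kMagicMod = 65556;          // the "magic" modulus, not 2^16

// Builds HdrItems + DataItems + EndItems from ascending block numbers.
std::vector<U32> BuildItems(const std::vector<U32>& nrs, U32 v, U32 k)
{
    std::vector<U32> items;
    items.push_back(4);                       // cHdrItems
    items.push_back((U32)nrs.size());         // b
    items.push_back(v);
    items.push_back(k);
    U32 last = 0, cs1 = 0, cs2 = 0;
    for (std::size_t i = 0; i < nrs.size(); ++i)
    {
        assert(nrs[i] >= last);               // input must be sorted
        const U32 delta = nrs[i] - last;      // 0 for a duplicate block
        cs1 += nrs[i];                        // checksum1: block numbers
        cs2 += delta;                         // checksum2: stored entries, before c^n
        items.push_back(delta);
        last = nrs[i];
    }
    items.push_back(cs1 % kMagicMod);
    items.push_back(cs2 % kMagicMod);
    return items;
}

// Renders the items as text; runs of equal DataItems become c^n.
std::string ItemsToText(const std::vector<U32>& items)
{
    assert(items.size() >= 6 && items[0] == 4);
    const std::size_t first = 4, end = items.size() - 2;   // DataItems range
    std::ostringstream os;
    for (std::size_t i = 0; i < items.size(); )
    {
        std::size_t run = 1;
        if (i >= first && i < end)
            while (i + run < end && items[i + run] == items[i])
                ++run;
        if (i) os << ',';
        os << items[i];
        if (run > 1) os << '^' << run;
        i += run;
    }
    return os.str();
}

// Recovers the block numbers; checks the header and both checksums.
bool DecodeItems(const std::vector<U32>& items, U32 v, U32 k, std::vector<U32>& nrs)
{
    if (items.size() < 6 || items[0] != 4)
        return false;
    const std::size_t b = items[1];
    if (items[2] != v || items[3] != k || items.size() != 4 + b + 2)
        return false;
    U32 last = 0, cs1 = 0, cs2 = 0;
    nrs.clear();
    for (std::size_t i = 4; i < 4 + b; ++i)
    {
        cs2 += items[i];
        last += items[i];                     // running sum restores the block number
        cs1 += last;
        nrs.push_back(last);
    }
    return cs1 % kMagicMod == items[4 + b] && cs2 % kMagicMod == items[5 + b];
}
```

   With v=7, k=3 and the sorted block numbers 1 5 5 5 9, ItemsToText
   gives 4,5,7,3,1,4,0^2,4,25,9. The two 0 entries are the duplicate
   blocks, 25 is checksum1 and 9 is checksum2.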


   Data Types Used:
   ----------------
     TULONG  = unsigned long int
     TUSHORT = unsigned short int
     TSSHORT = signed short int
     TCHAR   = signed char
     TBYTE   = unsigned char
     TBOOL   = TBYTE
     TSReihe = 2-dim-array of at least b elements with at least
               k TBYTE elements each (I use an array of structs).

   Simple External Functions not shown here:
   -----------------------------------------
     TCHAR* fgets2(TCHAR* buf, TUSHORT buflen, FILE* fp)
       Reads from fp into buf and converts any tabs into spaces
       and comment chars into '\0'. Returns buf, or NULL if EOF.

     TCHAR* StrFirstChar(TCHAR* sz)
       Returns ptr to the first nonspace char in sz; can also point
       to '\0' at the end of sz if none were found.

     TULONG Combi(TUSHORT v, TUSHORT k)
       returns Cn(v,k)
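
   Since Combi is not shown, here is one way it could look, together
   with a standalone version of the block numbering it is used for.
   Both functions are sketches under assumptions: CombiSketch and
   UnrankBlock are hypothetical names, they use 64-bit arithmetic
   instead of TULONG, and UnrankBlock returns a vector instead of
   filling a TBYTE array. The loop itself follows Blocknbr2Block as
   given in this file.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Binomial coefficient Cn(v,k). Exact as long as the result and the
// intermediate products fit into 64 bits.
std::uint64_t CombiSketch(unsigned v, unsigned k)
{
    if (k > v) return 0;
    if (k > v - k) k = v - k;             // use the smaller of k and v-k
    std::uint64_t c = 1;
    for (unsigned i = 1; i <= k; ++i)
        c = c * (v - k + i) / i;          // the division is exact at every step
    return c;
}

// Returns the block with number nr (1..Cn(v,k)) in ascending order.
std::vector<unsigned> UnrankBlock(std::uint64_t nr, unsigned v, unsigned k)
{
    assert(k >= 1 && k <= v && nr >= 1 && nr <= CombiSketch(v, k));
    std::vector<unsigned> block;
    std::uint64_t t = 0;                  // number of blocks skipped so far
    unsigned s = 1;                       // smallest candidate for the next point
    for (unsigned p = 1; p <= k; ++p)
    {
        unsigned i = s;
        for (; i <= v - k + p; ++i)
        {
            // c = number of blocks that have point i at position p
            const std::uint64_t c = CombiSketch(v - i, k - p);
            if (t + c >= nr)
                break;
            t += c;
        }
        block.push_back(i);
        s = i + 1;
    }
    return block;
}
```

   UnrankBlock(1, 49, 6) gives 1 2 3 4 5 6, and
   UnrankBlock(13983816, 49, 6) gives 44 45 46 47 48 49, where
   13983816 = Cn(49,6).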


   Further Info:
   -------------
     Tested with Borland C++ v4.52 in W32-mode only (and discovered
     a bad optimizer bug AND an error with setmode!); but it should work
     with any modern (that should mean: 'better' :) C++ implementation.

     Also, consult the DOC of XVRFY 1.03 or higher and the compressed
     sample design files there.

     The author can be contacted at: bm373592@muenchen.org

     Also check the following locations for new versions and other files
       http://www.tuco.com/math1.htm
       ftp.tuco.com/pub/math/


   Copyright (c) 1996 Uenal Mutlu. All rights reserved.
   You can use this encoding/decoding scheme freely if a reference to
   the author is made in the manual.

*********************************************************************/
//...
extern TCHAR* fgets2(TCHAR* buf, TUSHORT buflen, FILE* fp);
extern TCHAR* StrFirstChar(TCHAR* sz);
extern TULONG Combi(TUSHORT v, TUSHORT k);

#define printf2        printf
#define Blocknbr2Block ReihenNr2Reihe


TULONG Blocknbr2Block(TULONG nr, TUSHORT v, TUSHORT k, TBYTE* R)
  { /* Fills R with the corresponding block data of the complete design.
       R must be at least k bytes long.
       nr must be in the range 1..Cn(v,k).
       Returns nr, or 0 in case of invalid args or err.
       The block data is autom. sorted in ascending order.
       (NURMELA/ÖSTERGÅRD (cf. XVRFY -n) have similar funcs and call
       them 'rankSubset' and 'unrankSubset'. Beware: IMHO, this is
       not compatible with their implementation!).

       Example: nr=1,v=49,k=6        will give   1  2  3  4  5  6
                nr=13983816,v=49,k=6 will give  44 45 46 47 48 49

       Plausibility checks (ie. v < k etc.) omitted.
    */

    TULONG  t = 0;
    TSSHORT s = 1;
    for (TSSHORT p = 1; p <= k; p++)
      {
        TSSHORT i;  // declared outside the inner loop: it is used after it
        for (i = s; i <= (v - k + p); i++)
          {
            TULONG c = Combi(v - i, k - p);
            if ((t + c) >= nr)
              break;
            t += c;
          }

        R[p - 1] = (TBYTE)i;
        s = i + 1;
      }

    return nr;
  }

TULONG CompressedInput(TSReihe* pReihe, FILE* fp, TUSHORT Av, TUSHORT Ak,
               TULONG Ab, TULONG AulHdrItems = 4, TULONG AulEndItems = 2)
  { // fp should be positioned after the ID line
    // Ab is the number of DataItems (taken from the ID line)
    // pReihe must have at least Ab elems allocated
    // Currently HdrItems >= 4, EndItems >= 2 (taken from the ID line)
    // rc = Ab, or 0 if err
    // BEWARE: ulMagicMod is and always has been 65556L (not 2^16) !!!



    // AulHdrItems must be exact and at least 4
    // AulEndItems must be at least 2
    if (AulHdrItems < 4         || AulHdrItems > 31)
      return 0;
    if (AulEndItems < 2         || AulEndItems > 31)
      return 0;

    const  TULONG ulMagicMod = 65556L;    // beware: this is not 2^16 !!!
    const  TCHAR *const szDelim = "\t,/\r\n ";
    TCHAR  sz[256];
    TULONG ulLfd = 0, b = 0, ulLastReihe = 0;

    // hdritems will be read here: >=4
    TULONG  cH = AulHdrItems, bH = Ab;
    TUSHORT vH = Av,          kH = Ak;

    // idx of EndItems: the 2 checksums in file (1-based!)
    const TULONG ulCS1Ix = AulHdrItems + Ab + 1;
    const TULONG ulCS2Ix = ulCS1Ix + 1;
    // ... put additional EndItems here and see below for the assignments

    // checksums stored in file will be put here
    TULONG ulCS1File = 1, ulCS2File = 1;

    // our checksum calculations here
    TULONG ulCheckSum1 = 0, ulCheckSum2 = 0;

    while (1)
      {
        if (!fgets2(sz, sizeof(sz), fp))
          break;

        TCHAR* p = StrFirstChar(sz);
        if (!isdigit((TBYTE)*p))   // first nonspace-char must be a digit
          continue;

        TCHAR* pTok = NULL;
        while ((pTok = strtok(pTok ? NULL : p, szDelim)) && isdigit((TBYTE)*pTok))
          {
            TULONG ulVal = strtoul(pTok, NULL, 10);  // value read in (unsigned, 0..2^32-1)
            TULONG ulExp = 1;

            TCHAR* pExp = strchr(pTok, '^');
            if (pExp)
              {
                ulExp = strtoul(pExp + 1, NULL, 10);
                if (!ulExp)
                  goto lblA;  // err
              }

            for (TULONG ul = 0; ul < ulExp; ul++)
              {
                ulLfd++;  // 1-based

                if (ulLfd <= AulHdrItems)
                  { // HDRITEMS: cHdrItems,b,v,k,...

                    if      (ulLfd == 1 && ((cH = ulVal) != AulHdrItems))
                      goto lblA;
                    else if (ulLfd == 2 && ((bH = ulVal) != Ab))
                      goto lblA;
                    else if (ulLfd == 3 && ((vH = ulVal) != Av))
                      goto lblA;
                    else if (ulLfd == 4 && ((kH = ulVal) != Ak))
                      goto lblA;
                    //... put additional else's here if more HdrItems
                    //...
                  }
                else if (ulLfd >= ulCS1Ix)
                  { // ENDITEMS: cs1,cs2,...

                    if      (ulLfd == ulCS1Ix)
                      ulCS1File = ulVal;
                    else if (ulLfd == ulCS2Ix)
                      ulCS2File = ulVal;

                    //... put additional else's here if more EndItems
                    //...
                    else  // == dont check rest, if exist; instead go to end
                      goto lblA;
                  }
                else
                  { // DATAITEMS: <Ab> values between HdrItems and EndItems

                    ulCheckSum2 += ulVal;
                    ulLastReihe += ulVal;
                    ulCheckSum1 += ulLastReihe;
                    ReihenNr2Reihe(ulLastReihe, vH, kH, &pReihe[b]);
                    b++;
                  }
              }
          }
      }

lblA:
 // printf2("\nulLfdEnd=%lu\n", ulLfd);

    ulCheckSum1 %= ulMagicMod;
    ulCheckSum2 %= ulMagicMod;

    TBOOL fOk = TRUE;

    if (cH != AulHdrItems)
      fOk = FALSE, printf2("Err cH=%lu vs. AulHdrItems=%lu\n", cH, AulHdrItems);
    else if (bH != Ab)
      fOk = FALSE, printf2("Err bH=%lu vs. Ab=%lu\n", bH, Ab);
    else if (vH != Av)
      fOk = FALSE, printf2("Err vH=%u vs. Av=%u\n", (unsigned)vH, (unsigned)Av);
    else if (kH != Ak)
      fOk = FALSE, printf2("Err kH=%u vs. Ak=%u\n", (unsigned)kH, (unsigned)Ak);
    else if (!b || b != Ab)
      fOk = FALSE, printf2("Err b=%lu vs. Ab=%lu\n", b, Ab);
    else if (ulCheckSum1 != ulCS1File)
      fOk = FALSE, printf2("Err CS1=%lu vs. CS1File=%lu\n", ulCheckSum1, ulCS1File);
    else if (ulCheckSum2 != ulCS2File)
      fOk = FALSE, printf2("Err CS2=%lu vs. CS2File=%lu\n", ulCheckSum2, ulCS2File);

    return fOk ? b : 0;
  }

// --- eof ---
