2014-08-29

Intervals

The current interval notation is notorious for clashing with other notations and for having weird unmatched brackets (any given interval exhibits one of these problems, never both). For instance, \(x\in[0,1]\) is ambiguous as to whether \(x\) is a fuzzy logic value (a member of an interval) or a strict boolean value (a member of a list). That's quite a large ambiguity for this otherwise useful notation. Also, the unacquainted can easily confuse whether a bracket marks an open end or a closed end. I've always managed to remember, but that's no excuse to keep things as they are.

I propose a simple alternative: replace \((\) with \({<}|\) and \([\) with \({\leq}|\), and reflect for the right-hand end. So, our fuzzy logic interval would be \({\leq}|0,1|{\geq}\). Intervals of integers can keep the double-dot notation, so \({\leq}|0..1|{\geq}\) represents the same thing as the list \([0,1]\). It's a list because it corresponds to a portion of \(\mathbb Z\), which I consider a list (see last post for justification).

That's about it. The best way to approximate this in ASCII is by writing things like “<=|0,1|=>”. It's better to keep the ‘=’ next to the ‘|’, so that one avoids the misparse “(<=|0,1|>) =”. And a note for \(\LaTeX\): put the ‘<’-like symbols in curly braces to avoid them spacing as if they were infix operators.
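
To make the \(\LaTeX\) advice concrete, here is one possible set of macros (the macro names are my own invention, not an established package):

```latex
% hypothetical macros for the proposed interval delimiters
\newcommand{\ointl}{{<}|}     % open left end,    replaces (
\newcommand{\cintl}{{\leq}|}  % closed left end,  replaces [
\newcommand{\ointr}{|{>}}     % open right end,   replaces )
\newcommand{\cintr}{|{\geq}}  % closed right end, replaces ]
% usage: $x \in \cintl 0,1 \cintr$ for the fuzzy logic interval
```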

2014-08-28

Zip lists

I've laid off the maths for a while in preparation for this post. You may remember that the last post ended in a rather unsatisfactory manner. Here, I try to rectify the situation. The heart of the issue is that I want to be able to express two different types of equality between functions: the weaker set-wise complete closure and the stronger point-wise extensional equivalence.

I define a function \(f\) as being completely closed over a set \(S\) iff it satisfies these conditions (in non-mapping notation):\[\begin{aligned}
\forall x\in S.\,& f\ x\in S \\
\forall y\in S.\,\exists x\in S.\,& f\ x=y
\end{aligned}\]In other words, the domain and image of \(f\) are equal. If \(S\) is a finite set, \(f\) essentially reorders the elements. If \(S\) is infinite, there may be other things that \(f\) could do. For example, if \(f=\text{succ}\wedge S=\mathbb Z\), \(f\) is a kind of translation function. In any case, \(f\) is surjective. My aim is to be able to write this in a concise and generalisable way in mathematical notation.

If two functions are extensionally equivalent, they do the same job, and thus one can replace the other without changing the logic of the situation. However, they may have different computational properties, and hence can't be considered to be totally equal. The conditions for this relation between \(f\) and \(g\) with respect to set \(S\) can be written:\[
\forall x\in S.\,f\ x=g\ x
\]This, too, I want to have a concise and generalisable way of writing, since it is a very common relation.

So, why are these related? Well, let's start naïvely applying the principles of the notation without variables. For extensional equivalence, it is intuitive to write \(f=g\). Now, we want to limit that to some set? Maybe we say \((f=g)\ S\). That expands to \(f\ S=g\ S\) by the fork rule (since equality doesn't act upon functions). But, by implicit mapping, \(f\ S\) and \(g\ S\) are sets, so correspondences between individual elements are lost. All we've equated are the images for the functions, given a domain. If \(g\) is the identity function, the statement is that \(f\) is completely closed over \(S\).

As the title suggests, the solution lies in using lists, rather than sets. These retain an order, so there is still correspondence between individual elements in the domain and image. In fact, a function can be specified as a pair of lists, as well as a set of pairs. To say that \(f\) and \(g\) are extensionally equivalent over domain \(S\), we just have to make sure that \(S\) is cast to be a list. In symbols, \((f=g)\ \text{list}\ S\).
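
A minimal Haskell sketch of the two comparisons, using plain lists for zip lists and Data.Set for sets (the names extEq and imgEq are mine):

```haskell
import qualified Data.Set as Set

-- pointwise extensional equivalence: compare corresponding elements
extEq :: Eq b => (a -> b) -> (a -> b) -> [a] -> Bool
extEq f g xs = and (zipWith (==) (map f xs) (map g xs))

-- set-wise comparison: only the images get equated
imgEq :: Ord b => (a -> b) -> (a -> b) -> Set.Set a -> Bool
imgEq f g s = Set.map f s == Set.map g s
```

Taking g to be the identity, imgEq states complete closure, while extEq still demands pointwise agreement – exactly the gap described above.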

Both \((f=g)\ \text{list}\ S\) and \((f=g)\ S\) are useful. \(D\ {}_\cdot^2=2\cdot{}\) is an example of the former, where the implicit argument for the expression is \(\text{list}\ \mathbb C\). \((\Pi=(!))\ \text{list}\ \mathbb N\) is an example of where the list argument cannot be reliably inferred, and has a significant effect on the truth of the statement. Finally, \((\text{floor}\ \frac 2 \iota=\iota)\ \mathbb N\), where \(\iota\) is the identity function (an example of a non-bijective completely closed function).

Lists can occur in other situations, too. For example, \(\pm\) can be formalised by saying that it returns a 2-element list (as too does \(\sqrt \iota\), but not \({}_\cdot^{2/1}\); see an earlier post). Explicitly, \(x\pm y=[x+y,x-y]\). This gives \(\cos(A\pm B)=\cos A\cdot\cos B\mp\sin A\cdot\sin B\), which boils down to an equivalence between two lists. In fact, lists are so useful that it may be worth assuming that the standard sets are lists, which can be converted to sets as needed. This makes more sense, too, because the \(\text{list}\) function essentially introduces new information, whereas functions should lose information (like \(\text{set}\) does on lists). Also, in general, expressions like \(\text{head}\ \text{list}\ S\) are meaningless. However, given the axiom of choice, \(\text{list}\) can be defined for any set; it's just that some operations on it are meaningless.

2014-08-15

Phonotactics

This is the last piece of documentation I need before I can start creating words. That's not to say that I'll go about word creation particularly quickly. For the time being, brivla are likely to be taken from Lojban without any attempt at making them fit the correct word patterns. Remember that these rules don't apply to cmevla.

Phonotactically, but not phonemically, I distinguish between the onset and coda of a syllable. Codas can contain 0 or 1 approximants followed by 0 or 1 nasals. There are some further restrictions:
  • If followed by an obstruent, a nasal must have the same place of articulation.
  • Sequences “ij” and “uw” are not allowed unless immediately followed by a vowel.
Onsets are where most difficulty arises. There are a lot of rules to make, but with some guidelines, the number of explicit rules needed can be lessened. The main rule is that all onsets must be pronounceable at the start of a word (i.e., after a pause). This implies that all onsets start with obstruents; sonorants would be part of the preceding coda. There are two other important rules:
  • If an onset is deemed pronounceable, so is its reflection.
  • If an onset is deemed pronounceable, so is its revoicing.
The functions “reflect” and “revoice” are both self-inverse, and defined by these tables:

original   | p | b | t | d | f | s | z | m | n | w | l | i | e | a
reflection | k | g | t | d | x | s | z | r | n | j | l | u | o | a

original  | p | t | k | f | s | x | m | n | r | w | l | j | o | i | a | u | e
revoicing | b | d | g | f | z | x | m | n | r | w | l | j | o | i | a | u | e

The name “reflect” refers to the fact that, for its input, it picks the letter on the opposite side of the corresponding table about a vertical line of symmetry. “revoice” is a more obvious name. If one imagines the consonant table being 3-dimensional, revoice is similar to reflect, but works on a different axis (the voice axis, rather than the place-of-articulation axis).
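
A minimal Haskell sketch of the two functions as table lookups (the helper tableLookup is my own naming; since both functions are self-inverse, the table is consulted in both directions, and anything unlisted maps to itself):

```haskell
import Data.Tuple (swap)

tableLookup :: [(Char, Char)] -> Char -> Char
tableLookup tbl c =
  case lookup c tbl of
    Just c' -> c'
    Nothing -> maybe c id (lookup c (map swap tbl))  -- try the reverse direction

reflect, revoice :: Char -> Char
reflect = tableLookup (zip "pbtdfszmnwliea"    "kgtdxszrnjluoa")
revoice = tableLookup (zip "ptkfsxmnrwljoiaue" "bdgfzxmnrwljoiaue")
```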

These restrictions have some interesting corollaries. For instance, it can be seen that coronal (alveolar) consonants reflect to themselves. This gives them more freedom than labial and dorsal (velar) consonants, which lives up to our expectations of them. Sonorants, by revoicing to themselves, also enjoy similar freedom. ‘f’ and ‘x’ also revoice to themselves, but are probably heavily restricted, anyway.

To simplify things further, ‘o’, ‘a’ and ‘e’ are mutually considered phonotactically equivalent. ‘i’ and ‘u’ are special cases because they are close to ‘j’ and ‘w’. For example, I don't expect ‘pji’ to be valid, since it's too similar to ‘pi’. However, ‘pja’, ‘pju’, ‘pwi’ and ‘tji’ are all valid, being distinct enough from ‘pa’, ‘pu’, ‘pi’ and ‘ti’, respectively. Of course, for ‘pja’, we need to test ‘bja’, ‘kwa’ and ‘gwa’ before we accept it (and similar for the others).

Beyond that, there is only one more easily-stated rule I can think of: onsets cannot mix voicings. All of the other phonotactic rules are left, unfortunately, to common sense. That's implicitly my common sense, but I promise to not screw up!

2014-08-05

Morphology

Before moving onto the main morphology, I'd like to get cmevla out of the way. These are practically the same as dotside Lojban's cmevla. They start with a pause (‘.’) and end with a consonant or schwa followed by a pause. The requirement for a consonant/schwa comes about because pauses can arbitrarily occur between words, so cmevla need a feature that binds them together. When using Shwa, IPA or similar orthographies, any letters can be used in cmevla (except pauses and glottal stops), even those that are not part of the language. Phonotactic rules are also done away with.

All other words begin with an obstruent (“pbtdkgfszh”) or pause, and end with a normal vowel (“oiaue”). In other words, word breaks occur in and only in these environments:
  • at a pause or glottal stop (which is part of neither word)
  • at the gap between a normal vowel and an adjacent following obstruent
This resembles Ceqli, but has a major difference: obstruents can occur in the middle of a word, in which case they must be preceded by a sonorant (“mnrwlj”) or schwa (“y”). This allows much easier importation of loan words. Also – it nearly goes without saying – the scheme here is much simpler than Lojban's, with tosmabru and all. I don't rely on stress, which frees things up for songwriters. I suggest that the reason why there are barely any Lojban versions of existing songs, whether performed or not, is that it's so difficult to get the stress right (I've tried it a few times).

Other than the vowel-obstruent restriction, native words are free in form. There are no formal equivalents to lujvo, with more tanru-like structures taking their place. To keep the classes separate, brivla are required to have at least 1 nasal, and cmavo & tags have none. Just as one learns to recognise the types of Lojban words with little practice, there shouldn't be any problem telling the two types apart.

I haven't decided on an equivalent to the gismu creation algorithm yet, though I aim to have a Lojban-style one. With brivla being free-form, I'll need some way of deciding what concepts get a short word, and which get longer ones. I will likely base this on the length of the source words (by some metric).

Just to give an idea of what loan words would look like, here are some language names:
native    | cmevla       | IPA-style cmevla | brivla
.lojban.  | .lozban.     | .loʒban.         | .lonzba
Ceqli     | .tjerlis.    | .tʃeŋlis.        | tjerli
English   | .irglis.     | .ɪŋɡlɪʃ.         | .irgli
Esperanto | .espelanton. | .esperanton.     | .enspelanto
Español   | .espanjol.   | .espaɲol.        | .enspanjo
日本語    | .nihorgos.   | .nihõŋɡos.       | .nirhorgo

There are, of course, ISO code names, but I haven't worked out that system yet. Ceqli doesn't have a code, anyway. Also, stress and tone can be marked in any way that seems fit. For brivla, these have no phonemic value, but they may in cmevla.

2014-08-04

Phonology

This should be a steady post. I post it now because I'm sick of using Lojban's FA tags when giving examples for my language. I need words, and for that I need phonemes, phonotactic rules and a writing system.

Ideally, the writing system would be Shwa (used phonemically, because things like assimilation are optional, and can vary between speakers). However, a romanisation scheme is needed. That is all I'll document here; Shwa and IPA substitutions should be obvious. Here are the consonants:
            | Labial | Coronal | Dorsal | Glottal
Plosive     | p b    | t d     | k g    | .
Fricative   | f      | s z     | h      | (h)
Nasal       | m      | n       | r      |
Approximant | w      | l       | j      |

There are a few sticking points. Firstly, ‘r’ represents /ŋ/. This is just an ASCIIism derived from the shape of the lowercase letter (‘m’ → ‘n’ → ‘r’). I want newcomers to be able to write in the language without having to fiddle with their keyboard layout (though it is fun).

Then, there are the glottal consonants. ‘.’ is almost exactly the same as ‘.’ in Lojban. Notably, the dotside conventions for cmevla carry over as expected. With ‘h’, I decided to stick to the convention of many natural languages, which tend to have either /x/ or /h/, but not both. In Lojban, they're roughly in complementary distribution anyway. It's your choice which of [x] and [h] you use for ‘h’, and you can mix them in your speech; even English speakers may prefer [x] in certain circumstances. Instead of using Lojban's ‘'’, I use approximants to join vowels.

The final thing to mention is about approximants. Unlike Lojban, I use on-glides regularly, so distinguishing between vowels and semivowels becomes necessary. Also, it should be noted that the approximants' positions on the chart are more to make the abstractions work, rather than to give phonetic information. ‘w’ and ‘j’ are the semivowels of ‘u’ and ‘i’, respectively. ‘l’ is any lateral or rhotic.

Now, the vowels:
      | Front | Central | Back
Close | i     |         | u
Mid   | e     | y       | o
Open  |       | a       |

These are essentially the vowels of Lojban, and serve similar purposes. ‘y’ represents the schwa, which is morphologically distinct from the other vowels. In an IPAisation, it can be written as ‘ə’, but you may also want to change ‘e’ and ‘o’ to ‘ɛ’ and ‘ɔ’ to avoid them looking too similar.

The order of the vowels is “oiaue”. Tracing this out makes a star shape, thus guaranteeing maximum phonetic distance between vowels close in that sequence. The subsequence “iau” is also used (as it is in Lojban). You may or may not have noticed that the various classes of phonemes have different numbers of phonemes each. For instance, there are 3 approximants, 4 fricatives, 5 vowels and 6 plosives. These can come together to make sequences of semantically similar cmavo.

Where the Latin alphabet needs to be transcribed (like when using ISO 3166-1 codes for countries), the code letters are usually given sounds based on the ASCII orthography. I'll have to go into the morphology before describing this system, though. I will cover morphology and phonotactics soon, in that order.

2014-07-28

Syntactic sugar, part 2

A thing I forgot to mention last time was the existence of two modifiers grammatically equivalent to 〈(〉 and 〈)〉, but more restrictive. I'll call them 〈{〉 and 〈}〉. 〈{〉 is analogous to 〈(〉, but hides the following selbri from every tag cluster except the one directly to its right. 〈}〉 does the same, but reflected in the opposite direction. I'll be using them in this post.

The main focus of this post is on making new selbri from old ones. This sets it apart from the last post, which was full of “moving around” words. Were this a manual for basic Vim, the last post would document f, F, h, l, % et cetera, whereas this would document y, yy, d, p, P et cetera.

I'll start with something I introduced before: 〈vei ... ve'o〉. This takes a selbri with type \(\text{fa}:a; \: \text{fe}:b; \: \text{fi}:a\), which appears after the 〈vei〉, and any number of tag clusters of type \(b\) connected to a 〈ju'e〉 or the 〈ve'o〉. If nothing is connected to the 〈ve'o〉, the selbri should have a defined default for 〈fi〉, which is implicitly connected to the 〈ve'o〉. The structure is evaluated via a right fold of all the 〈ju'e〉s (and the 〈ve'o〉) in order.

Taking the example from before (well, slightly modified; I got it wrong), we get this reduction chain:
  1. vei ce'o 2 faju'e 3 faju'e 4 faju'e 7 faju'e ve'o
  2. kefa { 2 fafe ce'o fifa vei ce'o 3 faju'e 4 faju'e 7 faju'e ve'o ke'e
  3. kefa { 2 fafe ce'o fifa kefa { 3 fafe ce'o fifa vei ce'o 4 faju'e 7 faju'e ve'o ke'e ke'e
  4. kefa { 2 fafe ce'o fifa kefa { 3 fafe ce'o fifa kefa { 4 fafe ce'o fifa vei ce'o 7 faju'e ve'o ke'e ke'e ke'e
  5. kefa { 2 fafe ce'o fifa kefa { 3 fafe ce'o fifa kefa { 4 fafe ce'o fifa kefa { 7 fafe ce'o fifa vei ce'o ve'o ke'e ke'e ke'e ke'e
  6. kefa { 2 fafe ce'o fifa kefa { 3 fafe ce'o fifa kefa { 4 fafe ce'o fifa kefa { 7 fafe ce'o fifa kunti ke'e ke'e ke'e ke'e
〈kunti〉, by the way, comes from the Lojban word for “empty”, so represents the empty list (or set in other contexts; it's polymorphic). But the upshot is that we get what we expected – a grammatically atomic list – which can easily be translated into a more general form which uses simpler grammar. Even 〈ke... ke'e〉 can be rephrased, but it involves some messy pointer work.

This phrasing can also be used for things like sums. Taking 〈sumji〉 – “fa is the sum of fe and fi”, 〈vei sumji 2 faju'e 3 faju'e 4 faju'e 7 fave'o fafa 16〉 holds. Notice that, this time, it's okay to connect the last element straight to the 〈ve'o〉, since 4 + 7 works, whereas 4 :: 7 didn't. It has to be 4 :: 7 :: []. (‘::’ stands for the cons function, which prepends an element to a list; its type is \(a\rightarrow [a]\rightarrow [a]\).)
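
In Haskell terms, the two evaluations are right folds; a small sketch, assuming integer elements:

```haskell
-- the list example: nothing connected to 〈ve'o〉, so the default
-- identity (〈kunti〉, the empty list) seeds the fold
listExample :: [Int]
listExample = foldr (:) [] [2, 3, 4, 7]   -- 2 : (3 : (4 : (7 : [])))

-- the sum example: 7 is connected straight to 〈ve'o〉, so it seeds the fold
sumExample :: Int
sumExample = foldr (+) 7 [2, 3, 4]        -- 2 + (3 + (4 + 7)) == 16
```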

I was going to do a section on logical connection, using these brackets to extend from binary connection to multiary connection. But I still need to work on that. Instead, I'll go over something that looks like a J conjunction. To pay homage to J, I'll write it using the J symbol ‘&’.

〈&〉 is called “appose”. It's difficult to define directly, but the rule is that 〈kefi { broda fafa & fefa brode } foke'e〉 is equivalent in meaning to 〈kefe brode fife broda fifi brode feke'e〉, where 〈broda fa〉 and 〈brode fa〉 are functions relating their respective 〈fe〉 and 〈fi〉 terms. This will make more sense when I start using semantic tags, rather than trying to repurpose Lojban's positional tags. But, in short, if we wanted to say “Alice and Bill have the same height”, we'd say 〈.alis. fafi { du fafa & fefa cisni } fofa .bil.〉, where 〈du〉 means “fe = fi” and 〈cisni〉 means “fe has measurement fi (in dimension/aspect fo)”. This alleviates the need for places like Lojban's dunli3, and generalises well.

Because 〈{ ... fafa & fefa ... }〉 is quite heavy on words, there should be an alternative. I'll call it 〈&:〉 for now. So, the example translates as 〈.alis. fafa du &: cisni fefa .bil.〉.
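
This appose pattern closely matches on from Haskell's Data.Function; a sketch, where Person and height are hypothetical stand-ins for the things being measured:

```haskell
import Data.Function (on)

data Person = Person { name :: String, height :: Double }

-- 〈.alis. fafa du &: cisni fefa .bil.〉, roughly: equality apposed
-- with a measuring function
sameHeight :: Person -> Person -> Bool
sameHeight = (==) `on` height
```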

A final thing I want to introduce is the existence of a math mode, delimited by 〈li ... lo'o〉. In this, the tag clusters are worked out automatically, using attributes of the selbri present. For instance, 〈li 2 sumji 3 lo'o〉 gets expanded to 〈kefa { 2 fafero sumji firofa 3 } ke'e〉. There could, potentially, be many different versions of math mode, and I haven't decided on a default one yet. If one sets out the conventions clearly and has all the words, one can use math mode to encode many other unambiguous languages.

2014-07-23

CCy cmavo in Lojban

This is a post exclusively about Lojban. If you don't know Lojban, there's no point reading it. In this post, I argue that cmavo made up of an initial consonant pair followed by the schwa (and a pause) could be incorporated into Lojban. I propose that they be used as BY, for things like diacritics and non-Lojban letters. We would gain 48 such possible cmavo.

General CCV cmavo are not allowed because they'd connect to any following brivla. In that situation, they are indistinguishable from rafsi, and are assumed to be rafsi. However, CCy cmavo, like Cy cmavo, are followed by a pause. A pause causes an unconditional word boundary, so CCy cmavo have no chance of combining with a following word.

What about combining with the previous word? Well, were any type of CCV cmavo to combine with the previous word, the only type of word it could form would be a brivla. However, brivla never end with ‘y’, so we don't have this problem.

That's wrapped up, so let's be a bit more ambitious: CGy cmavo, where G ∈ {i, u}. That would yield cmavo like *{kuy.}, which seems like a good fit for ‘Q’. The previous arguments work for this form: ‘.’ can't appear in the middle of a word and ‘y’ can't appear at the end of a brivla. These work for any form which is able to hold together as a single word and ends with ‘y’. We could have CV'y and more.

What we do need to worry about is which CG pairs are valid initials. No native Lojban words, except attitudinals, have on-glides at all, and I can't even think of any fu'ivla that contain a CGV cluster.

2014-07-22

Filtration as implication

\(
\newcommand{\d}{\mathbin\cdot}
\newcommand{\abs}{\text{abs}}
\)In Marianesque mathematical notation, all mathematical statements are functions acting upon an implicit argument. This argument is taken to be the domain of the statement, and is hence a set. Implicit mapping does its magic, and everything is how we'd expect it. But often we want to make claims about part of the domain. In conventional mathematics, we'd say things like \(n>3\implies {}_\d^2n>9\). Indeed, we can still do that, but there are alternatives.

In this article, Jakub Marian defines square brackets to denote the filter function. It is defined thus for sets:\[
[p]\ xs = \{x:x\in xs\wedge p\ x\}
\]How do we use it? Well, Jakub translates classical:\[
{}_\d^2x+{}_\d^2y=1\implies\abs\ (x\d y)\in\left[0,\frac 1 2\right]
\]into:\[
\abs\ (x\d y)\ [{}_\d^2x+{}_\d^2y=1]\subseteq\left[0,\frac 1 2\right]
\]where \(x\) and \(y\) are essentially the fst and snd functions, and the implicit argument is the set of all pairs of real numbers. I use \(\abs\) rather than vertical bars for two reasons: vertical bars don't work well and I plan to have other uses for them in conjunction with brackets (to make syntax that looks similar to Haskell's quasi-quoters). Most notably, I need a notation for zip lists.

Getting back on topic, I think that this notation can be taken further. Notice how the \(\abs\ (x\d y)\) and \(\left[0,\frac 1 2\right]\) have become split up. This is not ideal. So, we can rephrase the whole thing as this:\[
\left(\abs\ (x\d y)\in\left[0,\frac 1 2\right]\right)\ [{}_\d^2x+{}_\d^2y=1]
\]I'd like to say that this has some new and interesting properties, but I can't find any right now. It does, though, give a different way of thinking about implication. It's more visual than the pure logical version, given that it uses sets (which can act like spaces).
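
The equivalence is easy to state in Haskell over a finite domain (a sketch; the name impliesOver is mine):

```haskell
-- ∀x∈dom. p x ⟹ q x, phrased as "q holds on the p-filtered domain"
impliesOver :: [a] -> (a -> Bool) -> (a -> Bool) -> Bool
impliesOver dom p q = all q (filter p dom)
```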

I'd like to make some notes about relations. To make things work well, usual equality – along with the greater-than and less-than relations – is not defined for functors. This is because equality between functions makes more sense when the domain is considered. We should be able to say \((\Gamma\ \text{succ}=\text{fac})\ \mathbb{N}\). But if equality also worked on sets, our claim would become \(\Gamma\ \text{succ}\ \mathbb{N}=\text{fac}\ \mathbb{N}\), which is a weaker claim than we wanted. Hence, we have to map from the inside out, mapping over the functions, then the set on the outside. \(\equiv\) is provided for equality between anything other than functions.

An alternative that always uses \(\equiv\) is to cast sets to zip lists before forking. \((\Gamma\ \text{succ}\equiv\text{fac})\ \text{zipList}\ \mathbb{N}\) is still the stricter claim. This is because lists maintain their order, so elements can be compared pairwise. However, I'm not sure whether it's realistic to expect people to remember to cast sets to lists just to allow forking. I think, though, that this is preferable, since the result is \(\top\) or \(\bot\), not \(\{\top\}\), \(\{\top,\bot\}\) or \(\{\bot\}\) as it would be with the set solution. Maybe we should default to zip lists, even for things like \(\mathbb{C}\), then cast them to sets (or just non-zip lists) as necessary.

2014-07-21

Syntactic sugar

My aim with my language (which really needs a name!) is for everything to be expressible using a small subset of the grammar, just as Haskell has its underlying Core language. As I currently imagine it, my core language contains rules about tokenisation, understanding tag clusters and using jumps to change which selbri a tag points to. But to make the language bearable, we need a bit more. I've introduced a few features already, particularly in the last post.

I'll start with some features that change what selbri tags refer to. I've already introduced 〈)〉, but I'll start with jumps. Jumps, which I'll write 〈<〉 and 〈>〉, move the choice for the next selbri either to the left or right. This setting lasts until the end of the tag cluster. For example, we can say 〈gerku plipe <fafa〉 and 〈>fafa gerku plipe〉, which both mean the same as 〈gerku fafa plipe〉. Possibly a more realistic example is 〈2 fafe<fi + fafa 4〉, where both the 〈fe〉 and the 〈fi〉 refer to 〈+〉. Additionally, there are 〈<|〉 and 〈>|〉, with macro expansions 〈<|tag〉 → 〈<tag>〉 and 〈>|tag〉 → 〈>tag<〉. These can be called “temporary jumps”. To make 〈<|<|〉 work as expected, the grammar contains “tag ::= jump tag”.

Jumps are fairly low-level, and could get complicated if used a lot. They're a bit like GOTOs, but they can't do anything crazy like reaching inside function brackets. A pair of alternatives, which I've partially introduced, is 〈(〉 and 〈)〉. 〈(〉 goes before a selbri, and hides it from all preceding tags. Similarly, 〈)〉 goes after a selbri, and hides it from all following tags. These are useful for creating sentences with more than one “noun” to one side of the “main verb”. These have less power than jumps, but are usually enough. They leave us with as much power as Lojban has!

A final necessary feature for tag clusters is 〈ku〉. It's used to separate tag clusters, and can be used to join multiple pairs of places from adjacent selbri, without joining the pairs to each other. I use it in the example from the next paragraph, and I think that what it does is obvious.

Another thing that can give text structure is use of function brackets. I've shown the use of 〈ke ... ke'e〉 where the expression's 〈fa〉, 〈fe〉 and 〈fi〉 places (function, first argument and second argument, respectively) are filled, but other places can be filled, too. In reality, my language is to have semantic tags (like BAI, or cases from natural languages), rather than positional tags (FA), so there are many tags to choose from. What tags the expression shows on the outside are defined inside using the selbri 〈cei〉. This makes it easy to define arbitrary selbri on-the-fly. For example, {batkyci'a} (“fa writes fe on medium fi using keyboard fo”) can be replicated by the expression 〈ke ciska fafa ku fefe ku fifi ku fofofa cei gunma fefa batka ke'e fa〉. Lojban requires paraphrasing to achieve the same kind of definition.

I have some more features to explain (I haven't even got to 〈vei ... ve'o〉 yet), but I've decided to split this post into two. By the way, any ideas about what I can call this language? I'm thinking of translating “join language” or “link language”, but I haven't even decided how to make compound words yet. I could also go for something idealistic, like “symmetry language”, but then I'd be worried about the possibility of there being a language which is more symmetrical.

2014-07-20

Functions in speech

Functional programming languages, and also modern imperative/multi-paradigm programming languages, often allow us to use functions without declaring them elsewhere. These are called either “anonymous functions” or “lambda expressions”. The only difference is that anonymous functions tend to be made of statements (see JavaScript), whereas lambda expressions contain, um, an expression (see Haskell). There is a third way (as seen in J and stack-based languages) in which anonymous functions are constructed using combinators. This is also seen in Haskell-like languages, in which it is called “points-free style”.

In a spoken language, the idea of side effects doesn't really apply. Hence, we don't need anonymous functions made up of statements. What look like statements in spoken languages can be seen as expressions that return a boolean value (true, hopefully). Other values can be derived by describing them in terms of predicates they satisfy. So we just need to find a way to create boolean-yielding functions.

Fortunately, our languages are full of boolean-yielding functions. We call them “selbri” or “verbs”. But we need more. In Lojban, there are 4 basic ways to make new selbri from old ones: tanru, lujvo, {be} phrases and abstractions. For my language, I consider tanru grammatically identical to {be} phrases, with some tags of the two component selbri linked together in normal tag clusters. There is no need for a word {be}, since links between selbri are always made explicitly.

That just leaves lujvo and abstractions. I'll start with abstractions. Lojban abstractions do a variety of things:
  • {du'u} refers to the fact of whether the contained bridi is true.
  • {nu} refers to events at which the contained bridi is true.
  • {ka} makes a function that takes a sumti as a parameter and describes how that sumti fits with the rest of the contained bridi.
  • {ni} refers to the extent to which the contained bridi is true, presumably as a number between 0 and 1.
There are some others, too. Of those, {ka} is the one that stands out. The rest just need a bridi, but {ka} needs a bridi containing 1 or more {ce'u}s, where {ce'u} is a cmavo confusingly glossed as “lambda”.

Property abstractions in my language are fairly similar to Lojban's {ka} abstractions, but aren't necessarily bound to a contiguous place in the sentence (though that does usually make things easier). 〈ka〉 is like any other selbri, but has a tag 〈ce'u〉 described as the “test set”. I call it this because 〈ka fa〉 acts like a function that tests its argument for membership in the set specified at 〈ce'u〉.

That's all we need, so here are the obligatory jumping dogs. I'll use the Lojban word {kakne} – “fa can do fe”. (I drop “under conditions x3” because any single dog is in exactly 1 condition. This follows from what I said about tenses. Noöne uses that place, anyway.) The result is 〈gerku fafa kakne fefa ka ce'ufa plipe〉, meaning “at least 1 dog can leap”.

{ka} is also used in Lojban to make functions with more arguments. This is done by adding more {ce'u}s, where the order of occurrences of the {ce'u}s determines which argument they get (alternatively, it's possible to name them in a {ce'ai} prenex to give them a different order). In my language, using a place again just adds to the claims of what's in that set. Because sets are unordered, this can't create ordered parameters. I could add more 〈ce'u〉-like places, but I don't have the vocabulary for that right now. Instead, I'll introduce a syntactic alternative which works well for binary functions.

As I said earlier, my language doesn't have tanru, so I can repurpose {ke} and {ke'e}. I'll use these for “function brackets”. Any tags immediately to the right of 〈ke〉 are connected to the first parameter, and any tags immediately to the left of 〈ke'e〉 are connected to the second parameter. A special rule says that both tags refer inwards by default. From the outside, the whole expression is grammatically atomic, and behaves like a brivla. 〈ke ... ke'e fa〉 is the test function on two arguments.

Examples of binary {ka} in Lojban are difficult to find. One I can think of is the x2 place of {porsi}. Let 〈porsi〉 have the definition “list fa is sorted by comparison fe from set fi”. 〈porsi fe〉 is then a function like “<”, which would be used to sort numbers into ascending order. To avoid introducing too much grammar, I'll stick to sorting a set of numbers. I do, though, have to introduce some grammar to deal with constructing lists and sets. In short, constructors are contained within a 〈vei〉...〈ve'o〉 pair. 〈vei〉 is followed by a selbri specifying the type of connection, then each item is specified by 〈ju'e〉s in order, with tags connected to each (tags adjacent to a 〈ju'e〉 refer away from it, by default). The last element's tag can connect to 〈ve'o〉, or the identity element can be taken from the selbri.

Sets use the selbri 〈ce〉 (“fa is the set made of element fe connected to set fi”), and lists use 〈ce'o〉 (“fa is the list made by appending element fe to the left of list fi”, or, in Haskell “fa = fe:fi”). The pattern also allows other monoids, and even non-monoids, to have similar constructors, but I digress. We now have enough for my planned example: 〈vei ce 2 faju'e 7 faju'e 4 faju'e 3 fave'o fafi porsi fefa kefa mleca feke'e ) fifa vei ce'o 2 faju'e 3 faju'e 4 faju'e 7 fave'o〉. In real speech, the 〈ke〉...〈ke'e〉 structure would be redundant; we'd extract a function directly from 〈mleca〉 (“fa < fe”). But if we had a more complex relationship to satisfy, the function brackets would become necessary.

After that example, I really want to describe quicksort! We need some more grammar and vocabulary, though.

2014-07-16

Tense

In this post, I'm going back to some more specific points from my new language. This can be seen as an application of both the referential transparency ideas and the quantification ideas. I will use Lojban as the “other language” because it's simple and formalised, but most points will also apply to English and other languages.

In Lojban, every concrete sumti looks like a smear across space-time. This immediately looks as if it breaks referential transparency. I'm not who I used to be, for instance. But that sentence gives us a clue as to why referential transparency can still hold. “am” is not the same verb as “used to be”. Verbs essentially choose some piece of space or time, and then act upon a few relevant slices of those irregular hypersmears we call “nouns”.

“I'm not who I used to be” is quite an awkward thing to translate into Lojban. I get {mi na du lo pu du be mi} – “I'm not equal to a thing that was equal to me”. But I'm not sure whether this is actually correct. Following the above logic, the main {du} is acting upon here-and-now slices of its arguments: {mi} and {lo pu du be mi}. But what is the here-and-now slice of {lo pu du be mi}? Surely, {lo pu du be mi} satisfies {da} in {da pu du be mi}. So, we're taking a past slice of {da} and equating it to a past slice of {mi}. But {mi} (uniquely) satisfies this {da}. Hence, that sentence wasn't correct, because it reduces to {mi na du mi}. Incidentally, the English sentence is probably also false by the same logic, unless we arbitrarily fix tenses of nouns that are being used with certain verbs. But that's messy and non-general.

All of this prohibits us from relating nouns from different tenses unless we can give nouns tenses directly. So, why don't we do that? I don't see why not. If our nouns, rather than our verbs, have tense, all of this stuff becomes trivial. But there are a few things to remember. In order to do this, we get rid of the smear concept, and with it formally lose the notion of continuity. In maths, we've rederived such things anyway, so this is not as big a problem as it may sound. It does, though, throw up some counterintuitive interpretations (counterintuitive according to those used to smear languages).

The number of past-dogs that jump is much larger than the number of dogs that have jumped. Why? In non-smear languages, there is no intrinsic connection between a dog and that dog in the past. Every time a dog jumps, it becomes a different dog. Therefore, the number of past-dogs that jump is equal to the number of times any dog has ever jumped.

Of course, some way to get back the meaning of “the number of dogs that have jumped” is necessary. What we have to do is invent a predicate similar to “fa is a successor/predecessor to fe in tense fi by standard fo”. When put through this predicate, some of the past-dogs end up having the same successor in the present tense, and are reduced to a single present-dog. Those present-dogs can be considered “dogs who have jumped”.

The “by standard” place is important. The “fa” and “fe” places need not and do not have a formal connection. We can make the predicate as strict or loose as we like. For example, in translating the claim “Germany have won the World Cup 4 times”, we can specify that we consider West Germany to be a predecessor of present-day Germany, so their titles do in fact count for Germany. A literal translation of an overly-precise statement would go something like “4 predecessors/successors of Germany from the past by standard that the predecessors/successors largely occupy predecessors/successors of the land of Germany win the World Cup”. But the standard is nearly always unnecessary (and I'm not sure whether I've picked a good standard in the example).

This scheme also makes it possible to make truly tenseless claims. In Lojban, every bridi is cryptotyped with tense information about space and time. That applies even to {li re sumji li pa li pa}, in which all of that information is meaningless and weakens the claim, technically. Time and space have no bearing on whether 2 is the sum of 1 and 1, so it makes little sense to mention them. In a non-smear language, 1, 2 and the sum predicate can be taken to be universal constants, referring to things that don't happen to exist in space-time.

Referential transparency

This is another language post, since referential transparency is a given in mathematics. It is more general than the previous ones, and doesn't mention my new language or Lojban at all. This serves as a consolidation of my opinions from a conversation with la .az. about Quine's Quantifiers and Propositional Attitudes. I use Death Note for examples throughout, but I only reference events up to episode/chapter 2. No spoilers!

The essence of referential transparency in logic is that if two entities are identical to each other, there are no functions which can be applied to them that give different results. All languages, I believe, can achieve referential transparency under the right interpretation. Spoken languages and non-transparent programming languages will often rely on implicit information to tell apart predicates or arguments, but once all of this information is added, things will work out.

To see how referential transparency can work, consider “I believe that Kira has killed people” and “I believe that Light hasn't killed anyone”. Naïve interpretations relying on referential transparency would claim that, since Kira and Light are the same person, these beliefs are contradictory, and the speaker is being irrational. However, that's not the case. The speaker doesn't know that they are the same, so can legitimately hold both views.

At this point, Quine gives up on referential transparency, inventing the idea of “opaque contexts”, in which values can't be substituted freely. But I think that this is a bit extreme. If Yagami Souichirou is speaking, he can substitute “Light” for “my son”. Matsuda can also substitute “Light” for “Yagami's son”. The key problem in their reasoning is that they don't know that Light and Kira are the same. Words like “believe” do induce a context, but not an opaque one.

So, we are forced to concede that “Light” and “Kira” refer to different entities, since there exist functions to tell their referents apart. But they are, in some sense, the same. To the reader/viewer's logic, and to Light's own logic, they are indistinguishable. To formalise this, I use the notions of “referrers” and “phenomena”. The words “Light” and “Kira” refer to distinct referrers. We make all of our claims on the referrer level, since this is as much as we can know. Referrers contain information about claims people have made about them. Some of these claims are that they are identical to another referrer. If a person believes that two referrers refer to the same phenomenon (i.e., believes that they are “the same”), they can unify their knowledge about both referrers. Light knows that the referrers “Light” and “Kira” both refer to the same phenomenon, but Souichirou doesn't. Hence, Light knows that he's killed people, but his dad doesn't.

Furthermore, early on, Light believes that Lind L Tailor is L. According to him, both of those referrers have the same referent, so he believes that killing Lind L Tailor will kill L. When he realises that they are, in fact, different, he is able to separate all of his claims about L from his claims about Lind L Tailor. The obvious one is “Lind L Tailor is dead. L isn't.”.

Naturally, unification of knowledge about different referrers is done via fuzzy logic. In this post, I've used clear-cut examples to illustrate the points. There are many cases of this in real life, like aggregating experiences with friends. Each time, you're pretty sure that they're who they were before. L, thinking that Light might be Kira, would be rash to unify all of his knowledge from those two referrers completely, but overly cautious not to do it at all.

2014-07-15

Quantification

\( \newcommand{\R}{\mathbb R} \)This follows on from my previous language post, and is not about mathematical quantification specifically, though it bears resemblance to what Jakub Marian does. As before, I'm trying to use Lojban words but with new grammar. This, however, is not what I expect this unnamed language to be.

You may remember that I set {su'o} (“at least 1”) as the default quantifier. I also choose to have suffix quantifiers, so 〈gerku fafa plipe〉 = 〈gerku fasu'o fasu'o plipe〉. The way I imagine this is that there is a list of things matching 〈gerku fa〉 (dogs) on one side, and a list of things matching 〈plipe fa〉 (leapers) on the other. Any identical things from opposite sides are highlighted. There is at least 1 thing highlighted on the 〈gerku fa〉 side, and at least 1 thing highlighted on the 〈plipe fa〉 side.

The second quantifier may seem redundant. Can't we just draw lines between matching items from opposite sides, then count the lines? The answer is “no” because some quantifiers take into account properties of a single referent set. This is most obviously seen with {ro} - “every”. 〈gerku faro fa plipe〉 says that every dog leaps, and 〈gerku fa faro plipe〉 says that every leaper is a dog. By the way, both are false in this universe, but the former leads to an interesting topic that I'll discuss later.

There also exists 〈gerku faro faro plipe〉, which states that the referent sets of 〈gerku fa〉 and 〈plipe fa〉 are identical. We can see this as an alternative to {jo} (“if and only if”), but based on set membership rather than basic logic. This reasoning carries over to the previous two examples, where 〈gerku faro fa plipe〉 means “x is a dog implies that x leaps”. I find this much clearer than Lojban's {da gerku nagi'a plipe}, and I will strive to make the quantifier method available in situations that currently require basic logic in Lojban. Here, of course, {ro gerku cu plipe} would suffice.

A point I haven't covered is about order of quantification. In maths, \(\forall x\in\R.\exists y\in\R.x+1=y\) and \(\exists y\in\R.\forall x\in\R.x+1=y\) are very different claims, with the first being true and the second being false. So, how do we translate these into the new language?

First, we need some vocabulary. So, let 〈+〉 stand for {sumji} (“fa is the sum of fe and fi”), and 〈1〉 and 〈R〉 stand for the predicates “fa = 1” and “fa ∈ ℝ”, respectively. 〈pa〉 acts as a quantifier meaning “there is exactly one”. We also need some more grammar, so let 〈)〉 stand for “hide the previous selbri from now on”. This helps in translating infix notation. So, the first claim becomes 〈R faro fero + firo fa 1 ) faro fa R〉 and the second becomes 〈R faro fero + firo fa 1 ) faro fapa R〉 (the 〈fa〉s after the 〈)〉s are referring to a place of the 〈+〉s).

These bear little resemblance to the original claims, so take some thinking to understand (and to write). The logic is completely different, since the new language is based on comparing referent sets rather than finding solutions. The first claim essentially says \(\R+1\subseteq\R\), whereas the second is \(\{y\}=\R+1\land y\in\R\). The second is wrong because we're trying to squeeze all the results of \(\R+1\) into a single element.
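
Over a finite toy domain, the two set readings can be checked directly; a Haskell sketch (the function names are mine):

```haskell
-- ∀x∈dom. ∃y∈dom. f x == y, i.e. the image is contained in the domain
forallExists :: Eq a => [a] -> (a -> a) -> Bool
forallExists dom f = all (\x -> f x `elem` dom) dom

-- ∃y∈dom. ∀x∈dom. f x == y, i.e. the image is a single element {y}
existsForall :: Eq a => [a] -> (a -> a) -> Bool
existsForall dom f = any (\y -> all (\x -> f x == y) dom) dom
```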

But wait, “there exists” is not the same as “there exists unique”. Have we lost that distinction? Let's test with \(\forall x\in\R.\exists y\in\R.{}_\cdot^2x=y\) and \(\forall x\in\R.\exists y\in\mathbb C.x={}_\cdot^2y\), both of which should be true. We can translate these into set equations as we did for the previous examples, giving \({}_\cdot^2\R\subseteq\R\) and \(\R\subseteq{}_\cdot^2\mathbb C\). Let 〈kurtenfa〉 be “fa is the square of fe”, and 〈C〉 be “fa ∈ ℂ”. All of that gives us 〈R faro fero kurtenfa faro fa R〉 and 〈R faro faro kurtenfa fero fa C〉. They have no references to explicit numerical quantities, so we have avoided claiming that “there exists” = “there exists unique”.

So, where did the 〈pa〉 come from? Note first of all that we were trying to make a false claim, which may have caused the strange form. But thinking about it, what we were claiming with \(\exists y\in\R.\forall x\in\R.x+1=y\) was that \({\large\iota}\in\R\implies{\large\iota}+1=\text{const}\ y\), where \(y\) itself is constant. \(\text{const}\ y\) has only one value it can return, hence the 〈pa〉.

2014-07-11

A different logical language

I'm going to give the maths a rest for a while. I think that I've put forward everything I originally considered, so it's time to move on. In this post, I try to bring together conlang ideas stretching back about two years. Not all of them, obviously.

I assume knowledge of Lojban for these conlang posts, so they'll probably not make sense to non-Lojbanists. But for the rest of you, you'll be glad to know that I'll be using Lojban vocabulary in the examples as much as possible. That's partially because I haven't made any vocabulary, but we'll ignore that detail.

Lojban and my unnamed language have a pretty noticeable difference. I essentially throw out the concept of hierarchical bridi structure in favour of arbitrary relationships between a set of predicates. An example is in order. In Lojban, you may say {lo gerku cu plipe}, for which ilmentufa gives:

[parse tree diagram: {plipe} as the main selbri, with {lo gerku} filling its x1]
Notice that {plipe} takes the primary spot as the main selbri, and {gerku} is contained within {plipe}'s realised place structure. Alternatively, we can say {lo plipe cu gerku}, which differs only in emphasis:

[parse tree diagram: {gerku} as the main selbri, with {lo plipe} filling its x1]
I propose that a better model of the interaction here is to allow both of them to be sentence-level selbri, and then link their place structures. Using the text 〈gerku fa fa plipe〉 in this new language, we get something that looks like this:

[diagram: 〈gerku〉 and 〈plipe〉 both sentence-level selbri, linked by the tag cluster 〈fa fa〉]
(Bow down to my HTML skills ;-)!) This still requires some explanation. In a tag cluster (see diagram), the first tag, by default, points to the preceding selbri. The rest point to following selbri in order. So here, the first 〈fa〉 extracts 〈gerku〉's x1, and the second 〈fa〉 extracts 〈plipe〉's x1. All tags within a tag cluster have their referents equated. The default quantification rules tell us that {su'o gerku cu du su'o plipe}, or alternatively {su'o da zo'u da gerku je cu plipe}. The second translation, though the more Lojbanic one, ends up losing quantifier information in more complex examples.

By the way, you may be wondering why I chose 〈gerku fafa plipe〉. It comes from an English pun I invented to make you remember the ideas here: “Is the dog jumping or the jumper dogging?”. Ah, discrepancies between noun and verb forms...

2014-07-04

Multi-valued functions

\(
\newcommand{\r}{\rightarrow}
\newcommand{\R}{\mathbb R}
\newcommand{\d}{\mathbin{\cdot}}
\)In the last post, I mentioned that the integration function returns the set of all functions of which the argument is the derivative. To reïterate, \(\int : (\R\r\R)\r\{\R\r\R\}\). There's nothing strange about a function returning a set of values, so everything's good so far.

What we need are ways of dealing with these results. The typeclass that helps us here is the monad. For our purposes, a monad \(m\) is just an applicative functor with a suitable \(\text{join} : m\ (m\ a)\r m\ a\) function. For sets, this function takes a set of sets, then concatenates the inner sets. For functions, we infer from the type \((r\r(r\r a))\r(r\r a)\), equivalently \((r\r r\r a)\r r\r a\), that \(\text{join}\ f\ x = f\ x\ x\). The function monad isn't immediately relevant, but it's there.
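
Both joins exist in Haskell via Control.Monad.join; a quick sketch, with lists standing in for sets:

```haskell
import Control.Monad (join)

setsLike :: [Int]
setsLike = join [[1, 2], [3]]   -- concatenates the inner lists: [1,2,3]

functions :: Int
functions = join (+) 3          -- join f x = f x x, so 3 + 3 == 6
```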

Now we need to define some more multi-valued functions. The obvious candidates are roots. Superscript-operator pairs are defined by
\[\begin{aligned}
{}_*^1x &{}= x \\
{}_*^nx &{}= x*{}_*^{1-n}x
\end{aligned}\]for non-zero natural \(n\). This, for some operators and arguments, can be extended to cover non-natural \(n\), but should still return a value of the same type as \(x\). Thus, these cannot be multi-valued functions. Instead, roots written with the radical symbol, which are not defined this way, can be defined as multi-valued functions in the obvious way.

It may be worth defining an \(\text{inv}\) function which returns the set of all possible inverse functions, rather than just the simplest one (as \({}_\circ^{1-0}\) does, by the above logic). So \(\int = \text{inv}\ D\) and \(\sqrt{\large\iota} = \text{inv}\ {}_\d^2\).

We have some functions, so let's try something simple. We can work out \(\sqrt[4]x\) by doing \(\text{join}\ \sqrt{\sqrt x}\):
\[\begin{aligned}
\sqrt[4]{16} &{}= \text{join}\ \sqrt{\sqrt{16}} \\
&{}= \text{join}\ \sqrt{\{4,4-0\}} \\
&{}= \text{join}\ \{\{2,2-0\},\{2\d i,2\d i-0\}\} \\
&{}= \{2,2-0,2\d i,2\d i-0\}
\end{aligned}\]Notice that I used implicit mapping, but implicit joining isn't a thing. We could want a set of sets, and implicit joining would screw that up.
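
The same computation can be sketched in Haskell, with the list monad standing in for sets (so duplicates would not be removed) and Data.Complex supplying a principal square root:

```haskell
import Data.Complex (Complex)

-- both square roots, built from the principal one
sqrts :: Complex Double -> [Complex Double]
sqrts z = [w, negate w] where w = sqrt z

-- join (fmap sqrts (sqrts 16)), written with bind
fourthRoots :: [Complex Double]
fourthRoots = sqrts 16 >>= sqrts   -- [2, -2, 2i, -2i], up to rounding
```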

Oh, and here's a relevant Red Dwarf joke:
In answering the question “What is a monoid in the category of endofunctors?” – write bigger – there are various words that need to be defined. What is a category, what is one of endofunctors, why is it one of endofunctors, and why is it so frequently linked with monoids?

What the hell is a monoid?
And there concludes the worst monad tutorial ever.

2014-07-02

Back to the old way, part 2

\(
\newcommand{\id}{{\large\iota}}
\newcommand{\d}{\mathbin{\cdot}}
\newcommand{\R}{\mathbb{R}}
\newcommand{\r}{\rightarrow}
\)Remember that division and subtraction have their operands switched. I explained it in my first post, but it's worth reïterating because things would be rather confusing on here if you forgot. Remember also that they bind right-first.

Say we have:
\[
\frac{Dx}{Dy} = x + \sin x
\]and want to find an expression for \(y\). We can do it in a pretty orthodox way:
\[\begin{aligned}
\frac{Dx}{Dy} &{}= (\id+\sin)\ x \\
Dy &{}= (\id+\sin)\ x\d Dx \\
y &{}\in \int((\id+\sin)\ x\d Dx) \\
&{}\in \int\left(D\left(\left(\frac{2}{{}_\d^2}+\cos-0\right)\ x\right)\right) \\
&{}\in \left(\frac{2}{{}_\d^2}+\cos-0\right)\ x + \R \\
&{}\in \frac{2}{{}_\d^2x}+\cos x-\R
\end{aligned}\]The Jakub Marian method of integration doesn't need a designated variable, but here I have emulated one. The bonus, of course, is that our intuitions get formal grounding; we really were multiplying by \(Dx\).

Separation of variables is an obvious extension, where one of the intermediate steps leaves integrals on both sides. So, consider:
\[\begin{aligned}
\frac{Dx}{Dy} &{}= 2\d x\d y \\
\frac{y}{Dy} &{}= 2\d x\d Dx \\
\int\frac{y}{Dy} &{}= \int(2\d x\d Dx) \\
\ln y &{}\in {}_\d^2x+\R \\
y &{}\in \exp{}_\d^2x\d(0,\infty)
\end{aligned}\]Notice that, like Jakub, I interpret indefinite integration as the complete inverse of differentiation. \(\int : (\R\r\R)\r\{\R\r\R\}\). I haven't thought of a consistent and satisfying way of writing the result of this, which is why I end up switching from \(=\) to \(\in\) suddenly.

Back to the old way

\(
\newcommand{\R}{\mathbb{R}}
\newcommand{\C}{\mathbb{C}}
\newcommand{\r}{\rightarrow}
\newcommand{\d}{\mathbin{\cdot}}

\)Leibniz' way of expressing the derivative of one implicit function with respect to another was to write \(\frac{dy}{dx}\) – intuitively the infinitesimal change in \(y\) divided by the infinitesimal change in \(x\), giving a gradient. Of course, when division is flipped, this becomes \(\frac{dx}{dy}\), putting the variables in the familiar \((x,y)\) order.

So, now imagine that \(x\) isn't the independent variable, and we instead have an unnamed independent variable that can be used via \(\large\iota\). The derivative of \(x\) is just known as \(Dx\), and the derivative of \(y\), with respect to the independent variable, is \(Dy\).

Let's do something easy. Let \(y = x + \sin x\). Notice that \(\sin x\) becomes \(\text{sin}\circ x\) when we evaluate the expression, since \(x\) is an \(\R\r\R\) function (or \(\C\r\C\) or whatever). Also, the whole thing is forked over \(+\). We can rewrite the equation as \(y = ({\large\iota} + \sin)\ x\). Then we can differentiate, using the chain rule: \(Dy = (1 + \cos)\ x\d Dx\). Magically, we're left with:
\[
\frac{Dx}{Dy} = 1 + \cos x
\]Whereas it's a sin in normal mathematics to throw around solitary \(dx\)s, \(Dx\)s are just ordinary functions. You can do whatever you want with them.

Let's try something a little more interesting. Consider:
\[\begin{aligned}
{}_\d^2x+{}_\d^2y&{}=1&& \\
2\d x\d Dx+2\d y\d Dy&{}=0&& \quad \text{applied }D\text{ to both sides} \\
\frac{y}{x}+\frac{Dx}{Dy}&{}=0&& \quad \text{applied }(2\d y\d Dx/)\text{ to both sides} \\
\frac{Dx}{Dy}&{}=\frac{y}{x}-0&& \quad \text{applied }\left(\frac{y}{x}-\right)\text{ to both sides}
\end{aligned}\]Notice that curried functions are useful for documentation. They give us the option to be more symbolic, rather than using verbose phrases like “divide through by ...”, which have to be translated before we can see them interacting with the equation at hand.

2014-07-01

Forcing rank

\(
\newcommand{\len}{\text{len}}
\newcommand{\r}{\rightarrow}
\newcommand{\d}{\mathbin{\cdot}}
\)By default, functions that act upon functors are not mapped at all. So, for instance, \(\len\ \{\{1,4,6,9\},\{2\},\{5,7\}\} = 3\). But what if what we actually wanted was \(\{1,2,4\}\)? (Notice that the set has been resorted. Notice also that if two of the inner sets had the same length, the result would have been smaller than the initial set, since duplicates are removed.)

This is where we use the prefixed subscript. This acts differently based on whether or not the number is negative (it must be an integer, by the way). For positive \(n\), \({}_nf\) is the function \(f\) applied to the parts of rank \(n\) with respect to the functor \(f\) handles. \({}_{n-0}f = {}^n\text{fmap}\ f\). \({}_0f\) is \(f\) applied as if \(f\) didn't handle the functor it does (the values it gets will be implicitly lifted).

But what is rank? I should have explained this before. The rank of a value, with respect to some functor, is how many times functors of that type can be “unwrapped” from the value. For example, a value of type \(\{\{\mathbb{N}\}\}\) (as in the first example) has rank 2, since it is wrapped twice in the Set functor. A value of type \((a\r b\r c)\r(a\r b)\r a\r c\) has rank 3, since it has been wrapped in the Function functors \(((a\r b\r c)\r)\), \(((a\r b)\r)\) and \((a\r)\). This is the type of the \(S\) combinator, \(S\ f\ g\ x = f\ x\ (g\ x)\), which, as you can see, takes 3 arguments.
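
A quick Haskell rendering of the first example, with lists standing in for sets:

```haskell
xss :: [[Int]]
xss = [[1, 4, 6, 9], [2], [5, 7]]

atRank2 :: Int
atRank2 = length xss        -- 3: applied to the whole value

atRank1 :: [Int]
atRank1 = fmap length xss   -- [4,1,2]: mapped one rank down
```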

So, to achieve the result of our first example, we can do either \({}_1\len\ A\) (applying to the rank-1 cells) or \({}_{1-0}\len\ A\) (applying to the cells 1 rank below the given value's rank). We can also distinguish between \({}^2D\left({}_\d^3\right)\) and \({}_1{}^2D\left({}_\d^3\right)\):
\[
 \begin{aligned}
{}^2D\left({}_\d^3\right) &{}= (DD)\left({}_\d^3\right) \quad & {}_1{}^2D\left({}_\d^3\right) &{}= \text{fmap}\ {}^2\ D\ {}_\d^3 \\
&{}= D\left(D\left({}_\d^3\right)\right) & &{}= {}^2\left(D\left({}_\d^3\right)\right) \\
&{}= D\left(3\d{}_\d^2\right) & &{}= {}^2\left(3\d{}_\d^2\right) \\
&{}= (6\d{}) & &{}= \left(3\d{}_\d^2\right)\left(3\d{}_\d^2\right) \\
&&&{}= 3\mathbin{\d}{}{}_\d^2\left(3\d{}_\d^2\right) \\
&&&{}= 27\d{}_\d^4
\end{aligned}
\]Okay, it's rare to use this explicitly. But it's there.

Multiplicative groups

When using this notation, I always write multiplication explicitly. This makes the descriptor “multiplicative” make no sense in describing structures where the binary operator is not written.

The implicit operator is application. Therefore, we may want to make applicative groups, where applying one element to another produces a new element. So, we have elements of type \(t\), where \(t = t\rightarrow t\). But this is bad. It means that our elements have infinite type, and that opens us up to Curry's paradox.

Instead, we give all elements type \(a\rightarrow a\), and make a compositional group. For example, let's look at \(C_3\), with elements \(\{e,f,g\} : \{G\rightarrow G\}\). The group table gives us ways to rewrite any composition of these functions, like \(f\circ g = e\).

So, what happens when we don't write the \(\circ\) operator? Implicit mapping comes into play. In \(fg\), \(f : G\rightarrow G\) expects a \(G\), but gets \(g : G\rightarrow G\). (It's important to note that the \(G\)s in each type signature are the same. \(G\) can't match any type expression non-trivially containing \(G\).) Fortunately, \((G\rightarrow)\) is a functor, so \(fg\) becomes \(\text{fmap}\ f\ g\), which is \(f\circ g\).
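
A minimal Haskell sketch of \(C_3\) as a compositional group (the point names GA, GB, GC are arbitrary):

```haskell
data G = GA | GB | GC deriving (Eq, Show)

e, f, g :: G -> G
e = id
f GA = GB; f GB = GC; f GC = GA
g = f . f

-- the group table's rewrite rule, f ∘ g = e, holds pointwise:
check :: Bool
check = all (\x -> (f . g) x == e x) [GA, GB, GC]   -- True
```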

This way of describing group-like structures happens to be very similar to the way of defining monoids (and hence groups) in category theory. A monoid is defined as a category with a single object and some arrows, one of which is the identity arrow. Necessarily, all of the arrows go from and to the object. In my example, the object corresponds to type \(G\), and the elements each correspond to an arrow.


(I wish I had a digit negative 1 right now!)

One thing to note is that one should not use this notation unless the structure has been proven associative. Function composition is associative, so using it to express the operation of a non-associative structure would be wrong.

2014-06-29

Implicit mapping

The last post was about what happens when one tries to apply a function to a value that is too small (not wrapped in an applicative functor). This post is about what happens when the value is too big (wrapped in too many functors). Jakub Marian defines (when translated into my notation):
\[\begin{aligned}
(f\ g)\ x &{}= f\ (g\ x) \\
f\ A &{}= \{f\ x : x\in A\}
\end{aligned}\]I posted the applicative functor definition in the last post, but here we only need a regular functor. As far as I understand it, the functor is what comes directly from category theory, and an applicative functor just adds some functions that functors nearly always have. There are very few functors that cannot be made into applicative functors.

\(
\newcommand{\fmap}{\text{fmap}}
\)A functor \(f\) provides us with just:
\[
\fmap : (a\rightarrow b)\rightarrow f\ a\rightarrow f\ b
\]For sets, \(\fmap\) applies a function to every element of the set (eliminating duplicates afterwards), and for functions, it composes the given function (type \((a\rightarrow b)\)) to the left of the functor's function (type \(f\ a\), which is \(t\rightarrow a\)). I encourage you to see why the types work for the latter example.
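For the function case, the Prelude's instance is essentially \(\text{fmap} = (\circ)\), so the types can be checked on a concrete example (the name is mine):

    -- fmap over (Int ->) composes the given function on the left:
    -- fmap (*2) (+3) is \x -> (x + 3) * 2
    doubleAfterAdd :: Int -> Int
    doubleAfterAdd = fmap (*2) (+3)

    -- doubleAfterAdd 4 == 14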

\(
\newcommand{\neg}{\text{neg}}
\newcommand{\R}{\mathbb R}
\)So, post done? Pretty much. It's maybe worth going through the example of \(\neg\ {}_\cdot^{\neg\ 1}D\), where \(\neg : \R\rightarrow\R\) is the negation function and \(D : (\R\rightarrow\R)\rightarrow \R\rightarrow\R\) is the differentiation function (and the superscript-subscript notation is described in the initial post). Of course, limiting to real numbers is arbitrary; I just needed a number type.

Firstly, we try to apply \(\neg : \R\rightarrow\R\) to \({}_\cdot^{\neg\ 1} : \R\rightarrow\R\). \(\R\rightarrow\R\) doesn't match \(\R\), but it does match \(f\ \R\), where \(f = (a\rightarrow)\) (here, \(a = \R\)). So, we transform \(\neg\ {}_\cdot^{\neg\ 1}\) into \(\fmap\ \neg\ {}_\cdot^{\neg\ 1}\), which is \(\neg\circ{}_\cdot^{\neg\ 1}\), the negative reciprocal function. Let's call it \(\text{negrec} : \R\rightarrow\R\).

Next, we try to apply that to \(D : (\R\rightarrow\R)\rightarrow \R\rightarrow\R\). \(D\)'s type certainly doesn't match the expected \(\R\), but it does match \(h\ (g\ \R)\), where \(g = (b\rightarrow)\) (where \(b = \R\)) and \(h = (c\rightarrow)\) (where \(c = \R\rightarrow\R\)). Note that it doesn't match \(i\ \R\) where \(i = ((\R\rightarrow\R)\rightarrow\R\rightarrow)\) because \(\rightarrow\) is not associative (it binds to the right). So, \(\text{negrec}\ D\) becomes \(\fmap\ (\fmap\ \text{negrec})\ D\), or \((\text{negrec}\circ{})\circ D\).

That's as far as we can go, so let's apply this function to, say, \(\exp\) and \(0\). \[\begin{aligned}
((\text{negrec}\circ{})\circ D)\ \exp\ 0
&{}= ((\text{negrec}\circ{})\ (D\ \exp))\ 0 \\
&{}= (\text{negrec}\circ\exp)\ 0 \\
&{}= \text{negrec}\ (\exp 0) \\
&{}= \text{negrec}\ 1 \\
&{}= \neg\ 1
\end{aligned}\]And that tells us that the gradient of the normal to the exponential curve at 0 is negative 1.
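For the sceptical, here's the whole example as a Haskell sketch, with a central-difference approximation standing in for the true \(D\) (the helper names and the step size are mine):

    -- crude numeric stand-in for D
    d :: (Double -> Double) -> Double -> Double
    d f x = (f (x + h) - f (x - h)) / (2 * h)
      where h = 1e-6

    negrec :: Double -> Double
    negrec = negate . recip    -- i.e. fmap negate recip

    -- fmap (fmap negrec) d, i.e. (negrec .) . d
    normalGradient :: (Double -> Double) -> Double -> Double
    normalGradient = (negrec .) . d

    -- normalGradient exp 0 is approximately -1.0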

Implicit value lifting

Jakub Marian's notation without variables has a definition I call “the fork rule”:
\[
(f\star g)\ x = f\ x\star g\ x
\]It is also made clear that if either \(f\) or \(g\) is a constant, the constant should be preserved. For instance, the following has to hold:
\[
({\large\iota}+1)\ 2 = {\large\iota}\ 2+1\ 2 = 2 + 1 = 3
\]where \(\large\iota\) is the identity function. Obviously, the thing we need to sort out is \(1\ 2=1\).
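Haskell's function applicative already implements the fork rule: \(\star\) becomes liftA2 \(\star\), with the constant written as an explicit constant function (a sketch; making the constant function implicit is the business of the rest of this post):

    import Control.Applicative (liftA2)

    -- (iota + 1) via the fork rule: \x -> id x + const 1 x
    iotaPlusOne :: Double -> Double
    iotaPlusOne = liftA2 (+) id (const 1)

    -- iotaPlusOne 2 == 3.0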

My first idea was that all constants should be turned into constant functions. Essentially, “1” becomes a function that always yields the value 1 (but not the function 1). However, I realised that this was unsatisfactory. Of minor importance was the fact that it felt weird to never have access to the underlying object. But the main point is that I want to generalise the ideas of constants and constant functions to elements and single-element sets. If every constant were a function, it would also need to be a set, and a zip list, and every other similar structure we want to use. There would have to be some order to the wrapping of all of these structures, but there is no natural order.

So instead I say this: values can be implicitly lifted when given to a function that expects a value wrapped in an applicative functor. By way of introduction, an applicative functor \(f : \text{Type} \rightarrow \text{Type}\) comes with definitions for these functions:
\[\begin{aligned}
\text{fmap}\ &: (a\rightarrow b)\rightarrow f\ a\rightarrow f\ b \\
\text{pure}\ &: a\rightarrow f\ a \\
{<}{\ast}{>}\ &: f\ (a\rightarrow b)\rightarrow f\ a\rightarrow f\ b
\end{aligned}\]So \(f\) acts like a wrapper on other types, an idea played on by this functor/applicative/monad introduction. We won't need monads here (yet), but they are related.

For this post, we're only interested in \(\text{pure}\). For \(\text{Set}\), \(\text{pure}\ a = \{a\}\), and for types \((t\rightarrow)\) (functions from arbitrary type \(t\)), \(\text{pure}\ a = \lambda x.a\). So there we have our singleton set and constant function connected. (Note for functional programmers: sets are indeed functors in maths, since any things of the same type in maths can be equated. Mathematicians don't care for such worthless notions as computability.)
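In Haskell terms (Data.Set can't be a real Applicative precisely because its operations demand Ord, which is the note's point, but singleton plays the part of pure):

    import qualified Data.Set as Set

    -- pure for Set: the singleton set {a}
    pureSet :: a -> Set.Set a
    pureSet = Set.singleton

    -- pure for (t ->): the constant function
    pureFun :: a -> t -> a
    pureFun = const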

To address \(1\ 2 = 1\), we'll write it out with an explicit application operator, \($\). So, we consider \(1\mathbin$2\). \($\) has type \((a\rightarrow b)\rightarrow a\rightarrow b\), so we're hoping that \(1\) fits \(a\rightarrow b\) and \(2\) fits \(a\). \(2\) does indeed fit \(a\) (anything does), so we need \(1\) to fit \(\mathbb{R}\rightarrow b\). It doesn't. But it does fit \(b\), and \((\mathbb{R}\rightarrow)\) is an applicative functor. So we transform \(1\mathbin$2\) into \(\text{pure}\ 1\mathbin$2\), which is \((\lambda x.1)\ 2\), which is \(1\). Yay!
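The same steps can be checked in Haskell, where pure for the function applicative is const:

    lifted :: Double -> Double
    lifted = pure 1    -- const 1, i.e. \x -> 1

    -- lifted $ 2 == 1.0; likewise pure 1 :: Maybe Int is Just 1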

So we can now use constants where functions were wanted, elements where sets were wanted and values where maybes were wanted. That seems enough for this post.

2014-06-28

J meets Haskell, for maths

This is the introductory post on a few of my ideas for mathematical notation. Some are more useful or realistic than others (with no particular correlation between the two). It's based on Jakub Marian's notation without variables, so big thanks to Jakub.

Here are some ways in which my notation differs from Jakub's:
  • Juxtaposition means function application. So it's \(f\,x\) as much as it's \(\sin x\). This is the only feasible notation for application of curried functions, other than having some explicit left-associative infix operator. And because I plan on using functions a lot (as JM does), brevity of application is important. Function application is the closest-binding operation.
  • Juxtaposition doesn't mean multiplication. So it's \(2\cdot x\) as much as it's \(2\cdot 3\). There's nothing special and fundamental about multiplication, so it doesn't have particularly special notation.
  • Implicit mapping is better-defined. When applying a function to some functor that it doesn't handle, the function is \(\text{fmap}\)ed until it can match its argument. Otherwise, it's not \(\text{fmap}\)ed at all. There's a bit more to it, and it probably deserves its own post.
  • Implicit mapping is controlled via subscripted numbers before the function. The numbers work in the same way as array ranks in J (the " conjunction), but also work for mapping over other functors, like functions. It's rare to actually use these, but they're there for anyone who needs them.
  • Exponentiation is generalised, and the exponent is written before the base. \(_\cdot^2x\) (notice the dot) denotes the square of \(x\), and \(_\circ^\infty\cos 0\) denotes the fixpoint of \(\cos\), starting from \(0\) (\(\circ\) is the explicit function composition operator; see the sketch after this list). The subscript-superscript pair is considered a function, so the usual precedence and mapping rules for function application apply. I haven't decided how this handles non-associative operators.
  • Subtraction and division have their operands switched. I wrote a Quora answer about this, so I won't explain it here, yet. This also requires that both are right-associative (though the associativity of division is rarely relied upon).
    • This also applies to the \(\bmod\) function, now also written in prefix (like any other function with a name made of letters). \(\bmod 2\ 5=1\). An alternative is to use \(\mathbin\%\) (ugly) or repurpose \(\mathbin|\) (like J). Both of those are written with \mathbin in \(\LaTeX\), which adds spacing.
    • Limits of integration are also swapped.
  • Elision of arguments implies partial application, not use of a default argument. So it's \(2-0\), like it's \(\frac{2}{1}\) (remember, they both got flipped, so \(2-0\) is negative 2). This may seem annoying, but being able to use partially-applied infix operators has its own rewards.
  • Oversetting and undersetting can be used for composition around infix operators. \(x\mathbin{\underset\log+}y=\log x+\log y=\log(x\cdot y)=x\mathbin{\overset\log\cdot}y\). Ironically, this nifty non-linear notation comes from two J conjunctions. This is possibly most useful for defining weaker equality operators, as seen in modular arithmetic (\(5\mathbin{\underset{\bmod 2}=}3\)).
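Two of those items map straight onto existing Haskell idioms, for anyone who wants to play along (a sketch: a real \({}_\circ^\infty\) would iterate to convergence, not a fixed 1000 times):

    import Data.Function (on)

    -- _o^∞ cos 0, crudely: iterate cos from 0 many times
    dottie :: Double
    dottie = iterate cos 0 !! 1000   -- ≈ 0.739085...

    -- undersetting log on (+): x (under-log +) y = log x + log y
    underLogPlus :: Double -> Double -> Double
    underLogPlus = (+) `on` log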
Numerals are separate from everything I've listed here, and are not really relevant. For the record, I prefer balanced base twelve at the moment, but I'm going to try balanced base twenty-four soon. Balanced bases have, amongst other advantages, the property that small negative numbers can be written with a single character, rather than the three my flipped subtraction requires (e.g. \(2-0\)).

Of all these, the switching of parameters is probably the most jarring and the most incompatible with standard maths notation. It takes a while to get used to, so I'll try to spare you from it as much as possible on this blog. The other large differences mostly don't clash with existing notation, and the rest have no counterparts in the current system to clash with.