
Edited by: Xiaochuan Pan, East China University of Science and Technology, China

Reviewed by: Carl Van Vreeswijk, Centre national de la recherche scientifique, France; Petia D. Koprinkova-Hristova, Institute of Information and Communication Technologies, Bulgarian Academy of Sciences, Bulgaria

*Correspondence: Zhijie Wang, College of Information Sciences and Technology, Donghua University, No. 2999 North Renmin Road, Songjiang District, Shanghai 201620, China

This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

An important question for neural encoding is what kind of neural system can convey more information with less energy within a finite-time coding window. This paper first proposes a finite-time neural encoding system, in which the neurons respond to a stimulus with a spike sequence that is assumed to be a Poisson process and the external stimuli obey a normal distribution. A method for calculating the mutual information of the finite-time neural encoding system is proposed, and the definition of information efficiency is introduced. The values of the mutual information and the information efficiency obtained with the Logistic function are compared with those obtained with other functions, and the Logistic function is found to be the best one. It is further found that the parameter representing the steepness of the Logistic function is closely related to the full entropy, while the parameter representing the translation of the function is tightly associated with the energy consumption and the noise entropy. The optimum parameter combinations of the Logistic function that maximize the information efficiency are calculated as the stimuli and the properties of the encoding system are varied, and explanations for the results are given. The model and the method proposed here could be useful for studying neural encoding systems, and the optimum neural tuning curves obtained in this paper might exhibit some characteristics of real neural systems.

To some extent, a neural system can be viewed as an information processing system, where information from the environment is encoded by one system and then processed by another. Many neural encoding schemes have been proposed, among which the firing-rate coding scheme has been extensively explored. Neural tuning curves, or stimulus-response curves, are often used to model the input-output relationship of neurons under rate coding. To construct such models, one collects the firing rates of an isolated neuron presented with given inputs. The neuron is then treated as a "black box" and the data are fitted with a certain function, i.e., one does not need to know the details of the underlying mechanisms of the neuron; one only needs to find a function that fits the input-output data well. This raises an important question: though these tuning curves fit the input-output data of the neurons well, why do real neurons process information in such a way?

Information theory (Alexander and Frédéric,

There have been many studies on determining the tuning curves of neurons using information theory. For example, the method of entropy maximization is used to determine the tuning curves of neurons given that the distribution of the stimuli is known (Dayan and Abbott,

However, studies on optimum tuning curves that take all three aforementioned factors into account are insufficient. The aim of this paper is to investigate what kind of neural tuning curves could make a neural encoding system with a finite-time window have high information efficiency, i.e., convey more information about a set of stimuli with less energy consumption. This paper is organized as follows. In Section Model and Method, the model of the neural encoding system is described, a method for calculating the mutual information for stimuli with variable steps is proposed, and the definition of information efficiency is introduced. In Section Results, it is shown that Logistic functions are the optimum tuning curves of the neural system, by analyzing the effects of the neuronal channel noise and the energy consumption on the optimum tuning curve and by comparing the values of the information efficiency obtained with Logistic functions and with other functions. The relationship between the information efficiency and the parameters of the Logistic function is investigated, and the optimum combinations of the parameters for maximizing the information efficiency are also explored. Conclusions and discussions are presented in Section 4.

In this section, a finite-time neural encoding system based on the firing rate coding is presented. A method for calculating the mutual information of the encoding system is proposed.

Stimuli are input into a neuron (or a population of neurons), which encodes the stimuli into firing rates. The strength of the stimuli (e.g., the light intensity) is supposed to be continuous and to obey a Gaussian distribution, of which the probability density is described by

p(s) = exp(−s^{2}/(2σ^{2})) / (√(2π) σ), s ∈ [s_{min}, s_{max}];

here σ is the standard deviation of the stimulus distribution and [s_{min}, s_{max}] is the range of the stimulus strength.
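The stimulus model above can be sketched numerically; below is a minimal discretization of the Gaussian density restricted to [s_min, s_max]. The grid size n and the renormalization over the finite range are illustrative implementation choices, not taken from the paper.

```python
import math

def stimulus_probs(s_min=-2.0, s_max=2.0, sigma=1.0, n=41):
    # Discretize the Gaussian stimulus density on [s_min, s_max] and
    # renormalize over the grid, i.e. a truncated-normal sketch of p(s).
    grid = [s_min + (s_max - s_min) * i / (n - 1) for i in range(n)]
    w = [math.exp(-s * s / (2.0 * sigma ** 2)) for s in grid]
    z = sum(w)
    return grid, [x / z for x in w]
```

With the defaults this yields a symmetric distribution peaked at s = 0, which is the stimulus ensemble used in the examples of the paper (s_min = −2, s_max = 2, σ = 1).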

The spike sequence is assumed to be Poisson process, as the neural responses are usually noisy and often modeled by Poisson statistics (Dayan and Abbott,
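The two model ingredients so far, a bounded tuning curve and Poisson spike counts, can be written down directly. The sketch below uses the Logistic tuning curve that the paper studies later (ε is the steepness, μ the translation; the default values are illustrative only), and evaluates the Poisson probability in log-space so that large mean counts do not overflow.

```python
import math

def logistic_tuning(s, eps=0.25, mu=0.5):
    # Logistic tuning curve f(s) in (0, 1); eps = steepness, mu = translation.
    return 1.0 / (1.0 + math.exp(-(s - mu) / eps))

def poisson_pmf(r, lam):
    # p(r | s): probability of observing r spikes in the window when the
    # mean count is lam = Fmax * T * f(s); lgamma avoids factorial overflow.
    if lam <= 0.0:
        return 1.0 if r == 0 else 0.0
    return math.exp(r * math.log(lam) - lam - math.lgamma(r + 1))
```

The pair (logistic_tuning, poisson_pmf) is all that is needed to build the conditional response distribution p(r|s) used in the entropy calculations below.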

We use mutual information to characterize the amount of stimulus information encoded in the number of spikes emitted by the neuron. Let H be the full response entropy, which is described by

H = − ∑_{r} p_{r} log_{2} p_{r},

where p_{r} is the probability of a response r and is related to the conditional probability p(r|s) and the probability density p(s) by p_{r} = ∑_{s} p(r|s) p(s).

Let H_{n} be the noise entropy, which is caused by the noisy nature of the neural response and is calculated by

H_{n} = − ∑_{s} p(s) ∑_{r} p(r|s) log_{2} p(r|s).

The firing rate of the neuron is λ(s) = F_{max} f(s), where f(s) ∈ [0, 1] is the tuning curve and F_{max} is the maximum firing rate of the neuron.

Then the mutual information can be obtained by

I_{m} = H − H_{n}.
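The entropies H and H_n, and their difference, can be evaluated directly on a discretized stimulus grid. The sketch below is a minimal implementation; the grid resolution and the response cutoff r_max are illustrative choices (the paper's own variable-step scheme is described later).

```python
import math

def poisson_pmf(r, lam):
    # Poisson probability of r spikes with mean count lam, in log-space.
    if lam <= 0.0:
        return 1.0 if r == 0 else 0.0
    return math.exp(r * math.log(lam) - lam - math.lgamma(r + 1))

def mutual_information(f, fmax_t, s_grid, p_s, r_max):
    # I_m = H - H_n for tuning curve f, joint parameter Fmax*T = fmax_t,
    # stimulus grid s_grid with probabilities p_s, responses 0..r_max-1.
    cond = [[poisson_pmf(r, fmax_t * f(s)) for r in range(r_max)] for s in s_grid]
    # Marginal response distribution p(r) = sum_i p(s_i) p(r | s_i).
    p_r = [sum(ps * c[r] for ps, c in zip(p_s, cond)) for r in range(r_max)]
    H = -sum(p * math.log2(p) for p in p_r if p > 0.0)            # full entropy
    H_n = -sum(ps * sum(p * math.log2(p) for p in c if p > 0.0)   # noise entropy
               for ps, c in zip(p_s, cond))
    return H - H_n
```

A typical call fixes a Gaussian stimulus grid and a Logistic tuning curve and returns I_m in bits; I_m is bounded above by the entropy of the discretized stimulus.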

According to Equation (6), λ(s)T is the average number of spikes within the time window T. In terms of Equations (3) and (5), p(r|s) and p(r) keep unchanged if F_{max} T keeps invariant (note that λ(s)T = F_{max}Tf(s)). Thereby H, H_{n}, and I_{m} keep unchanged if F_{max} T keeps invariant. Consider that a population of N neurons, where neuron i has maximum firing rate F^{i}_{max}, is used to encode the stimulus. The N neurons are assumed to receive the same stimulus but to respond to it statistically independently. The number of spikes emitted by neuron i, n_{i}, within the time window T is a random variable of Poisson distribution with mean F^{i}_{max} f(s)T and variance F^{i}_{max} f(s)T. As the N neurons in the population respond to the stimulus independently, the total number of spikes emitted by the N neurons within the time period T, r = ∑_{i} n_{i}, is also a random variable of Poisson distribution with mean ∑_{i} F^{i}_{max} f(s)T and variance ∑_{i} F^{i}_{max} f(s)T. Therefore, the population encoding system with N neurons is equivalent to a one-neuron system whose maximum firing rate is the summation of the N maximum firing rates of the N-neuron system, i.e., F_{max} = ∑_{i} F^{i}_{max}. Therefore, we only discuss the one-neuron system and treat F_{max} T as one parameter in the following analysis. A finite-time window means that T is not very large, implying that F_{max} T is not very large if the population size is not very large either.
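The population-equivalence claim (the sum of independent Poisson counts is Poisson with the summed mean) can be checked numerically by convolving two count distributions; the means λ₁ and λ₂ stand for F¹_max f(s)T and F²_max f(s)T and are illustrative values.

```python
import math

def poisson_pmf(r, lam):
    # Poisson probability of r spikes with mean lam, in log-space.
    return math.exp(r * math.log(lam) - lam - math.lgamma(r + 1))

def population_equivalence(lam1, lam2, r_max=40):
    # Distribution of the summed count of two independent Poisson neurons
    # (explicit convolution) vs. one Poisson neuron with mean lam1 + lam2.
    conv = [sum(poisson_pmf(k, lam1) * poisson_pmf(r - k, lam2)
                for k in range(r + 1)) for r in range(r_max)]
    single = [poisson_pmf(r, lam1 + lam2) for r in range(r_max)]
    return max(abs(a - b) for a, b in zip(conv, single))
```

The returned maximum pointwise difference is at floating-point round-off level, confirming that only the product F_max T (summed over the population) matters for H, H_n, and I_m.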

Since spike generation and transmission occupy the main part of the energy consumption in the brain (Zhu et al., ), we measure the energy consumption by the mean number of emitted spikes, E = ∑_{r} p(r)r. We use an objective function that takes account of both the mutual information and the energy consumption to characterize the information encoding efficiency (denoted by I_{E}), which is written as

I_{E} = I_{m} − γE.

We determine the value of the parameter γ by requiring that the value of the objective function be zero if the neuron does "nothing" but amplifies the input signals by a factor of TF_{max}/(s_{max} − s_{min}) and lets them pass through. For example, given a neural encoding system with s_{min} = − 2, s_{max} = 2, σ = 1, and TF_{max} = 300, we set the tuning curve of the neuron as (s − s_{min})^{*} TF_{max}/(s_{max} − s_{min}). This linear tuning curve does "nothing" except amplify the stimulus and apply a translation. For the given parameters s_{min}, s_{max}, and σ, a unique γ can be determined. Here, we get I_{m} = 2.2177 and

To calculate the mutual information, we sample the stimulus strength into discrete points s_{i}, i = 1, 2, 3, …, M. The stimulus strength s_{i} corresponds to the firing rate F_{max} f(s_{i}). The response of the neuron is then a random variable obeying the Poisson distribution with mean F_{max} Tf(s_{i}) and variance F_{max} Tf(s_{i}). The conditional probability of the discrete version is calculated as

p(r_{j}|s_{i}) = (F_{max} Tf(s_{i}))^{r_{j}} exp(−F_{max} Tf(s_{i})) / r_{j}!,

which requires on the order of F_{max} Tf(s_{i}) multiplications per stimulus to obtain H_{n}. Therefore, if M and F_{max} Tf(s_{i}) are large (note that F_{max} T may be the summation of the N maximum firing rates of an N-neuron system, so F_{max} Tf(s_{i}) may be large if the population size is large), a large amount of calculation is needed, especially when we search for the optimum parameter combinations to maximize I_{E} (in this case, we need to calculate I_{E} under many different parameter combinations). To reduce the calculation burden, we propose a sampling scheme with variable step size. We sample f(s) into discrete points f(s_{j}), j = 0, 1, 2, …, M, with Δf(s_{j}) = f(s_{j}) − f(s_{j − 1}). Δf(s_{j}) can be different from Δf(s_{i}) for i ≠ j, and is determined as follows.

As H_{n} = − ∑_{i, j} p(s_{i}) p(r_{j}|s_{i}) log_{2} p(r_{j}|s_{i}) and H = − ∑_{j} p(r_{j}) log_{2} p(r_{j}) (r_{j} represents the neuronal response), if r_{j} ≫ F_{max} T max(f(s_{i})), then p(r_{j}|s_{i}) ≈ 0 and p(r_{j}) ≈ 0. Therefore, we limit the range of r as 0 < r < [2F_{max} T max(f(s_{i}))] in this paper ([.] means taking the integer part of the number). Furthermore, the larger F_{max} Tf(s_{i}) is, the closer the neighboring conditional probabilities become (for example, if F_{max} Tf(s_{i}) is very large, p(r_{j − 1}|s_{i}) ≈ p(r_{j}|s_{i}) ≈ p(r_{j + 1}|s_{i})). Based on this observation, we propose a sampling scheme with variable step size. When F_{max} Tf(s_{i}) is small (F_{max} Tf(s_{i}) < 1 in this paper), we let Δf(s_{j}) = f(s_{j}) − f(s_{j − 1}) = h/(F_{max} T); when F_{max} Tf(s_{i}) ≥ 1, we let the step size Δf(s_{j}) be larger. In this way we obtain the discrete points f(s_{j}), j = 0, 1, 2, …, R. Accordingly, we obtain the discrete points of the stimuli s_{j} which produce f(s_{j}). Thus, the continuous variable of the stimuli, s, is discretized. Owing to the sampling scheme with variable step size and the limitation of the range of the values of r_{j}, the computational efficiency is greatly improved.
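A variable-step grid of this kind can be sketched as follows. The fixed step h/(F_max T) in the small-mean regime follows the text; the growth rule used once F_max T·f ≥ 1 (a step proportional to √(f/(F_max T)), motivated by the √λ width of the Poisson distribution) is an assumed illustration, since the paper's exact formula is not reproduced here.

```python
import math

def sample_f_grid(fmax_t, h=0.5, f_max=1.0):
    # Variable-step sampling of f in [0, f_max].
    # While Fmax*T*f < 1: fixed step h/(Fmax*T), as in the text.
    # Once Fmax*T*f >= 1: step ~ h*sqrt(f/(Fmax*T)) -- an ASSUMED rule;
    # neighboring Poisson conditionals with large means are nearly equal,
    # so coarser sampling there loses little information.
    grid = [0.0]
    while grid[-1] < f_max:
        f = grid[-1]
        step = h / fmax_t if fmax_t * f < 1.0 else h * math.sqrt(f / fmax_t)
        grid.append(min(f + step, f_max))
    return grid
```

For F_max T = 300 this produces far fewer sample points than the ~600 of a uniform grid with step h/(F_max T), which is the source of the claimed speed-up.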

A very important question is what kind of tuning curves are the optimum tuning curves for the neural coding system. The expected shape of the neural response distribution when there is no noise in the neuronal channel can be obtained easily, based on the fact that neurons with tuning curves resulting from entropy maximization have maximum mutual information. It is known that the tuning curves corresponding to the integral of the probability density of the stimulus (see Figure

s_{min} = − 2, s_{max} = 2, σ = 1,

When the neuronal channel is noisy, maximum entropy does not lead to maximum mutual information, i.e., histogram equalization of the neuronal responses (each neuronal response having the same probability) does not necessarily lead to maximum mutual information. Furthermore, histogram equalization does not lead to the least energy consumption either. Then, what is the optimum tuning curve if both noise and energy consumption are considered?

If the noisy channel of the neurons is Gaussian and independent of the inputs, then maximum entropy leads to maximum mutual information if energy consumption is neglected. The estimate for λ (s), λ_{est} = r/T, will be approximately a Gaussian variable with mean λ and variance λ /T. Its square root will have a variance that is nearly independent of λ, and the noise entropy is H_{n} = ∑_{i} p(s_{i}) H_{n} (s_{i}) with H_{n} (s_{i}) = − ∑_{j} p(r_{j}|s_{i}) log_{2} p(r_{j}|s_{i}). Therefore, a θ-th power of the integral of the stimulus density may be the optimum tuning curve if both noise and energy are considered.
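The variance-stabilizing effect of the square root invoked above can be checked numerically: for r ~ Poisson(λ), the variance of √r stays near 1/4 regardless of λ. The moment computation below is a self-contained sketch; the summation cutoff is an illustrative choice.

```python
import math

def poisson_pmf(r, lam):
    # Poisson probability of r with mean lam, evaluated in log-space.
    return math.exp(r * math.log(lam) - lam - math.lgamma(r + 1))

def sqrt_moments(lam, r_max=None):
    # Mean and variance of sqrt(r) for r ~ Poisson(lam).
    # Cutoff at lam + 12*sqrt(lam): the neglected tail mass is negligible.
    r_max = r_max or int(lam + 12.0 * math.sqrt(lam)) + 1
    m1 = sum(math.sqrt(r) * poisson_pmf(r, lam) for r in range(r_max))
    m2 = sum(r * poisson_pmf(r, lam) for r in range(r_max))  # E[(sqrt r)^2] = E[r]
    return m1, m2 - m1 * m1
```

Across λ = 25, 100, 400, the variance of √r hovers near 0.25 while the mean tracks √λ, which is why the noise entropy becomes approximately input-independent in the square-root domain.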

However, it is interesting that the optimum tuning curve depends on the parameters of the system, such as σ, γ, F_{max}, and T, which we will discuss in the next section.

θ | 1.5 | 1.5 | 1.5 | 2 | 2 | 2 | 2.5 | 2.5 | 2.5
ϵ | 0.1 | 0.3 | 0.5 | 0.1 | 0.3 | 0.5 | 0.1 | 0.3 | 0.5
ε | 0.09 | 0.27 | 0.45 | 0.08 | 0.25 | 0.42 | 0.07 | 0.23 | 0.40
μ | 0.05 | 0.15 | 0.3 | 0.1 | 0.3 | 0.5 | 0.15 | 0.4 | 0.6

To check whether the Logistic function is the best tuning curve for information efficiency, other types of tuning curves are also adopted and comparisons are made. Let us consider power functions f(s) = α + s^{β}, which are commonly used forms of tuning curves (Poirazi et al.,

We fix TF_{max} = 300 and the stimulus variance σ = 1 and carry out simulations for the two kinds of tuning curves. The simulations cover the whole parameter space spanned by ε and μ, and the resulting surfaces of I_{E} are plotted in Figure ; each surface exhibits a plateau of I_{E}. Therefore, it is rational to compare the heights of the two plateaus to identify the better tuning curve for information efficiency. It can be seen that the plateau of information efficiency resulting from the Logistic function is higher than 6, while the one resulting from the power function (Equation 10) is lower than 5. Therefore, we conclude that the Logistic function is a better tuning curve for information efficiency. Simulations with other kinds of tuning curves, exponential functions and polynomial functions, are also carried out (results not shown), and the Logistic function is better than these functions as well. We also carried out simulations in which the stimulus variance σ is varied. The information efficiencies obtained with Logistic functions are also higher than those obtained with other kinds of functions under various stimulus distributions (results not shown). Therefore, we can conclude that Logistic functions are the best tuning curves for information efficiency.

TF_{max} = 300; other parameter values are set the same as those in Figure

It is clearly shown in Figure that the values of I_{E} increase with one of the parameters (ε or μ) when the other is fixed, reach a peak, and then decrease. To reveal the relationship between the information efficiency and the two parameters more clearly, we first fix one parameter and then vary the other. Figure shows the dependence of I_{E} on the parameter ε (see Figure ), together with I_{m} (it is worth noting that I_{E} is not measured in units of bits as I_{m} is, and a negative value of I_{E} means that the corresponding tuning curve is even worse than the one that just linearly amplifies the stimuli). In short, if μ is fixed, the full entropy is sensitive to the parameter ε. The curves for the full entropy, the mutual information and the information efficiency have the same shape.

Figure

We further explore the optimum parameter combinations ε_{m} and μ_{m} that maximize the information efficiency. As discussed in the first subsection of this section, it is intuitive that the optimum tuning curve is the curve of the integral of the input probability if T is very large and γ = 0. Thereby ε_{m} is the steepness of the Logistic function that fits this integral curve, and μ_{m} = 0. Therefore, ε_{m} will be large if the input probability distribution is flat (i.e., σ is large), while ε_{m} will be small if σ is small. On the other hand, ε_{m} will be small and μ_{m} will be large if T is small and γ is large. These intuitive observations are confirmed by the simulation results shown in Figures
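The search for (ε_m, μ_m) can be sketched as a brute-force grid search over the objective I_E = I_m − γE. All numeric values below (γ, F_max T, the stimulus grid, and the candidate parameter grids) are illustrative assumptions, far coarser than the paper's actual optimization.

```python
import math

def poisson_pmf(r, lam):
    # Poisson probability of r spikes with mean lam, in log-space.
    if lam <= 0.0:
        return 1.0 if r == 0 else 0.0
    return math.exp(r * math.log(lam) - lam - math.lgamma(r + 1))

def info_efficiency(eps, mu, fmax_t, gamma, s_grid, p_s):
    # I_E = (H - H_n) - gamma * E for a Logistic tuning curve (eps, mu).
    f = lambda s: 1.0 / (1.0 + math.exp(-(s - mu) / eps))
    r_max = int(2 * fmax_t) + 2            # responses beyond 2*Fmax*T are negligible
    cond = [[poisson_pmf(r, fmax_t * f(s)) for r in range(r_max)] for s in s_grid]
    p_r = [sum(ps * c[r] for ps, c in zip(p_s, cond)) for r in range(r_max)]
    H = -sum(p * math.log2(p) for p in p_r if p > 0.0)
    H_n = -sum(ps * sum(p * math.log2(p) for p in c if p > 0.0)
               for ps, c in zip(p_s, cond))
    E = sum(r * p for r, p in enumerate(p_r))  # mean number of spikes
    return (H - H_n) - gamma * E

def best_params(fmax_t=20.0, gamma=0.05, sigma=1.0):
    # Coarse grid search for the optimum (eps_m, mu_m); values are illustrative.
    s_grid = [-2.0 + 4.0 * i / 24 for i in range(25)]
    w = [math.exp(-s * s / (2.0 * sigma ** 2)) for s in s_grid]
    p_s = [x / sum(w) for x in w]
    cands = [(info_efficiency(e, m, fmax_t, gamma, s_grid, p_s), e, m)
             for e in (0.1, 0.25, 0.5, 1.0) for m in (0.0, 0.25, 0.5, 1.0)]
    return max(cands)
```

Refining the candidate grids (or handing info_efficiency to a proper optimizer) reproduces the kind of (ε_m, μ_m) sweeps over σ, γ, and TF_max reported below.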

Dependence of ε_{m} and μ_{m} on σ and the corresponding maximum information efficiency. (A) Dependence of ε_{m} and μ_{m} on σ;

Figures show the dependence of the optimum parameters ε_{m} and μ_{m} on the parameter σ and the corresponding maximum information efficiency. It is shown that with the increase of the variance of the stimulus distribution σ, the optimum parameter value ε_{m} increases monotonically. The maximum information efficiency is low when σ is small (see Figure

Figure shows the dependence of μ_{m} on the parameter γ. It is shown that if γ is large, i.e., the energy consumption is heavily weighted, then μ_{m} is large. Namely, μ_{m} increases with the increase of γ, while ε_{m} is insensitive to the parameter value of γ.

Dependence of μ_{m} on γ. The parameter values are set the same as those in Figure

Figure shows the dependence of ε_{m} and μ_{m} on the parameter T, namely TF_{max} (see the explanation of this joint parameter in Section Model and Method). It is found that the optimum parameter values ε_{m} and μ_{m} increase as TF_{max} increases. As the variance of λ_{est}, λ (s)/T, approaches 0 when T is very large, ε_{m} will be exactly the steepness of the Logistic function that fits the integral of the input probability (around 0.41 according to Figure ), while ε_{m} will be much smaller than 0.41 when T is very small, due to the noise effect. Therefore, the value of ε_{m} increases with the increase of TF_{max}. Energy consumption is very sensitive to the parameter T: it is proportional to T if the other parameters keep invariant. To save energy and thereby increase the information efficiency, μ_{m} needs to increase with the increase of T (note that μ_{m} is very sensitive to the energy consumption; see the curve of μ_{m} vs. TF_{max} in Figure ). As TF_{max} gets larger, the corresponding maximum information efficiency also gets larger (see Figure

Dependence of ε_{m} and μ_{m} on TF_{max} and the corresponding maximum information efficiency. (A) Dependence of ε_{m} and μ_{m} on TF_{max};

We use information theory to search for the optimum neural tuning curves that maximize the information efficiency. The information efficiency considered in this paper involves three factors: mutual information, the coding time window, and energy consumption.

We proposed a finite-time neural encoding system, where the spike sequence of the neuron corresponding to a stimulus obeys a Poisson process and the external stimuli obey a normal distribution. We also proposed a calculation method based on a variable sampling step to calculate the mutual information and the information efficiency. The effects of the neuronal channel noise and the energy consumption on the optimum tuning curve were analyzed, and the calculations of the mutual information and the information efficiency were carried out. It is found that Logistic functions are the best tuning curves in the sense that the information efficiency resulting from Logistic functions is higher than that resulting from other kinds of functions. Then we studied the relationship of the information and information efficiency of the neural system with the parameters of the Logistic tuning curves. It is revealed that the parameter representing the steepness of the Logistic function (ε) relates more closely to the full entropy, while the parameter representing the location of the function along the horizontal axis (μ) relates more closely to the noise entropy and the energy consumption. The curves for the full entropy, the mutual information and the information efficiency have the same shape if the parameter representing the location is fixed, while a Logistic function whose location is shifted a little to the right along the horizontal axis has higher information efficiency if the parameter ε is fixed. We further explored the optimum combinations of the parameter values of the Logistic tuning curve for maximizing the information efficiency when the properties of the stimuli and the neural system vary.
It is shown that with the increase of the variance of the stimulus distribution, the optimum value of the parameter representing the steepness (ε_{m}) increases monotonically; ε_{m} increases when the encoding time window or the maximum firing rate of the neuron gets larger; and μ_{m} increases with the increase of γ. Our results are consistent with the fact that Logistic functions, which fit experimental data very well in many neural experiments, may be the actual tuning curves in many real neural systems (Dayan and Abbott,

In this paper, we used a Poisson process to model the output of noisy rate-coding neurons. The results in this paper can be extended to more realistic neural models, for example, a Poisson process with absolute refractoriness (Dayan and Abbott,

TF_{max} = 20, ε = 0.25, μ = 0.5, σ varies from 0.2 to 2.

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

This research is supported by the National Natural Science Foundation of China (Grants Nos. 11472061, 70971021, 71371046, 61203325), Shanghai Rising-Star Program (No. 14QA1400100), “Chen Guang” project supported by Shanghai Municipal Education Commission and Shanghai Education Development Foundation (No. 12CG35), Ph.D. Program Foundation of Ministry of Education of China (No. 20120075120004), the Fundamental Research Funds for the Central Universities (No. 2232013D3-39).