The general goal of an autoencoder is to reduce the dimensionality of a dataset by transforming each sample into a very compact code. There are no specific rules for modeling the architecture, and your comment is quite interesting. Symmetry is not a necessary condition when, for example, the goal is to force the network to extract all the main features and to generate very compact (and sometimes also sparse) codes (which are generally represented as dense layers) that can be reconstructed using a smaller deconvolutional network.

The quality is impacted by all dimensions, so if you increase the number of weights (adding convolutions or deconvolutions), the results are likely to be more accurate. However, an autoencoder stores part of the “knowledge” in the network itself, so there should always be a trade-off between precision and complexity.

Mine was a very basic example that anybody can improve by increasing the complexity. The goal was to show the “dynamics” of a simple autoencoder without analyzing all the influencing factors (e.g. you can add an L1 penalty on the code layer to force sparsity). If you’re interested, you can find some examples in the repository https://github.com/PacktPublishing/Mastering-Machine-Learning-Algorithms (Chapter 11), where I show more complex standard autoencoders, denoising autoencoders, sparse autoencoders, and variational autoencoders.
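To illustrate the L1-sparsity idea mentioned above, here is a minimal sketch of a dense autoencoder whose code layer carries an L1 activity regularizer. The layer sizes and the regularization strength are assumptions for illustration, not the values used in the post:

```python
# Minimal sketch: an L1 activity regularizer on the code layer pushes most
# code activations toward zero, encouraging a sparse code.
# All sizes (512, 128) and the 1e-5 penalty are illustrative assumptions.
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers, regularizers

inputs = keras.Input(shape=(3072,))              # e.g. a flattened 32x32x3 image
h = layers.Dense(512, activation='relu')(inputs)
code = layers.Dense(128, activation='relu',
                    activity_regularizer=regularizers.l1(1e-5))(h)
h = layers.Dense(512, activation='relu')(code)
outputs = layers.Dense(3072, activation='sigmoid')(h)

autoencoder = keras.Model(inputs, outputs)
autoencoder.compile(optimizer='adam', loss='mse')
```

The L1 term is added to the reconstruction loss automatically during training, so no change to the `fit` call is needed.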

I am new to convolutional autoencoders, so my questions will probably sound very basic, and I apologize for this.

The decoder and encoder in your implementation are very asymmetrical.

conv1 produces 32 maps of 30×30, which flattens into a 28,800-dimensional structure (a huge increase from the original 3072-dimensional image).

Then there is a HUGE dimension reduction to the 128-dimensional code layer – that’s 225-to-1!

But then, the next fully connected layer (code_output) expands only to 30×30×3 = 2700 dimensions (that is to say, roughly 10 times smaller than on the encoder side).

That layer is then deconvolved to the original image size of 32×32×3.

Would it make sense to have the 32 30×30 maps on the decoder side too, and to tie the weights of the deconvolution layer to those of the convolution layer? (following http://people.idsia.ch/~ciresan/data/icann2011.pdf)

Wouldn’t it improve the quality of the learning to use 2-3 convolution layers with pooling in between, instead of fully connected layers in the middle, to avoid the dramatic dimension reduction mentioned above?

Looking forward to your reply.

Best

It clearly means that the list/array contains fewer elements than the index you are trying to access. Check the dimensions (using x.shape for arrays or len(x) for lists) before starting the loops or using indexes.
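A minimal illustration of that check (the array shape and index values are hypothetical):

```python
# Guard indexing against the actual array length to avoid IndexError.
import numpy as np

x = np.random.rand(10, 3)        # 10 samples, 3 features
print(x.shape)                   # check the dimensions first: (10, 3)

# Safe: iterate only up to the actual number of rows
for i in range(x.shape[0]):
    row = x[i]

# Unsafe: a hard-coded index beyond the length raises IndexError
try:
    _ = x[50]
except IndexError as err:
    print('IndexError:', err)
```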

I have this error, please help: https://uploads.disquscdn.com/images/93066cba175391f7263163b9c8115ba436eff9332276c412cfe0dcd37e2a9854.png

I highly recommend studying the basic concepts of Keras; otherwise, it’s impossible to have the minimum awareness required to start working with the examples.

Would you please tell me how many hidden layers you used in your model?

How can I determine that?

Thank you, man

In a convolutional network, it doesn’t make sense to talk about neurons. In this case, there are 8 layers (separated by dropout layers) with 32 (3×1) kernels (with ELU activation), followed by 2 dense tanh layers with 256 neurons and a softmax output layer with 2 units.
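A hedged Keras sketch of an architecture along those lines is shown below. The input shape, dropout rate, optimizer, and loss are assumptions not stated in the post; only the layer counts, kernel size, and activations follow the description above:

```python
# Sketch of the described architecture: 8 conv blocks with 32 kernels of
# size 3 (ELU) separated by dropout, then two dense tanh layers of 256 units
# and a 2-unit softmax output. Input shape (128, 1) and the 0.25 dropout
# rate are assumptions for illustration.
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([keras.Input(shape=(128, 1))])
for _ in range(8):
    model.add(layers.Conv1D(32, 3, activation='elu', padding='same'))
    model.add(layers.Dropout(0.25))
model.add(layers.Flatten())
model.add(layers.Dense(256, activation='tanh'))
model.add(layers.Dense(256, activation='tanh'))
model.add(layers.Dense(2, activation='softmax'))
model.compile(optimizer='adam', loss='categorical_crossentropy')
```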

The number of layers can be chosen in many ways:

- Experience
- Validation
- Grid-search

In general, it’s helpful to start with smaller models, checking the validation accuracy, overfitting, and so on, and then making a decision (e.g. adding new layers, increasing or decreasing the number of units, adding regularization, dropout, batch normalization, …). The golden rule (derived from Occam’s razor) is to try to find the smallest model which achieves the highest validation accuracy. An alternative (but more expensive) approach is based on a grid search. In this case, a set of models based on different parameters is trained sequentially (or in parallel, if you have enough resources) and the optimal configuration (corresponding to the highest accuracy/smallest loss) is selected. Normally this approach requires more iterations because the initial grid is coarse-grained and it’s used to determine the sub-space where the optimal parameter set is located. Then, several zooms are performed in order to fine-tune the search.
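The coarse grid-search step can be sketched as a plain loop over candidate depths and widths, keeping the configuration with the best validation accuracy. The grid values, dataset, and epoch count below are placeholders for illustration:

```python
# Sketch of a coarse grid search over depth and width. The synthetic dataset,
# grid values ([1, 2] layers, [16, 32] units), and 3 epochs are illustrative
# assumptions; in practice you would use your real data and a wider grid.
import itertools
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Tiny synthetic binary-classification dataset
rng = np.random.RandomState(0)
X = rng.rand(200, 16).astype('float32')
y = (X.sum(axis=1) > 8).astype('float32')

def build_model(n_layers, n_units):
    model = keras.Sequential([keras.Input(shape=(16,))])
    for _ in range(n_layers):
        model.add(layers.Dense(n_units, activation='relu'))
    model.add(layers.Dense(1, activation='sigmoid'))
    model.compile(optimizer='adam', loss='binary_crossentropy',
                  metrics=['accuracy'])
    return model

best = None
for n_layers, n_units in itertools.product([1, 2], [16, 32]):
    model = build_model(n_layers, n_units)
    hist = model.fit(X, y, validation_split=0.2, epochs=3,
                     batch_size=32, verbose=0)
    val_acc = hist.history['val_accuracy'][-1]
    if best is None or val_acc > best[0]:
        best = (val_acc, n_layers, n_units)

print('best (val_acc, n_layers, n_units):', best)
```

A second, finer grid would then be built around the winning configuration, as described above.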

In your architecture, how many hidden layers did you use? And is the number of neurons in each layer 32? Am I right?

How can you know which number of layers would be beneficial for your model?

Thanks a lot