Complex applications made simple
FreeSWITCH removes much of the complexity from advanced applications. Let's look at two examples of a more complex application.
The first application we will discuss is the voicemail application. This application is useful to add right after the bridge application as a second option, executed in cases where the call was not completed. We can do this with one of those special variables that we were discussing earlier. Let's look at a version of our last extension that also allows us to leave a voicemail:
<extension name="example 4">
  <condition field="destination_number" expression="^2000$">
    <action application="set" data="hangup_after_bridge=true"/>
    <action application="bridge" data="user/2000"/>
    <action application="voicemail" data="default ${domain} 2000"/>
  </condition>
</extension>
Here, we see two uses of channel variables. First, we set hangup_after_bridge=true, telling the system to hang up once we have successfully bridged the call to another phone and to disregard the rest of the instructions. Second, we use the domain variable in curly brackets prefixed with a dollar sign, ${domain}. This is a special variable that defaults to the auto-configured domain name, which comes from the XML configuration.
In this example, we check if someone is dialing 2000. We then try to bridge the call to the user whose endpoint is registered as extension 2000. If the call fails or if there is no answer (that is, if the bridge attempt has failed, so we do not execute the hangup after the bridge), we continue to the next instruction, which is to execute the voicemail application. We provide the information the application needs to know (which domain the voicemail belongs to and which extension the voicemail is for), so the application knows how to handle the situation. Next, the voicemail application plays the pre-recorded greeting or generates one using the Say module's interface, which we briefly discussed earlier. It then plays short sound files one after another to make a voice say something like "The person at extension 2 0 0 0 is not available, please leave a message." Next, mod_voicemail prompts you to record a message. As an additional feature, if you are not satisfied with your recording, you can listen to it and re-record it as many times as you wish. Once you finally commit, a FreeSWITCH MESSAGE_WAITING event is fired into the core event system queue, which is picked up by mod_sofia by way of an event consumer, and the event information is translated into SIP, in this case a SIP NOTIFY message that lets the SIP phone know that there is a message waiting. A blinking lamp (the Message Waiting Indicator, or MWI) lights up on the receiving phone.
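The same bridge-then-voicemail pattern can be generalized to a whole range of extensions. The following is a minimal sketch under stated assumptions rather than a definitive configuration: the extension name, the 1000-1019 range, the capture group, and the 30-second call_timeout value are all chosen here for illustration.

```xml
<extension name="local_with_voicemail">
  <!-- Capture any extension from 1000 to 1019 into $1 (assumed range) -->
  <condition field="destination_number" expression="^(10[01][0-9])$">
    <!-- Give up ringing after 30 seconds so the call can fall through (assumed value) -->
    <action application="set" data="call_timeout=30"/>
    <!-- If the bridge succeeds, hang up afterwards and skip the remaining actions -->
    <action application="set" data="hangup_after_bridge=true"/>
    <!-- Keep executing the Dialplan when the bridge attempt fails -->
    <action application="set" data="continue_on_fail=true"/>
    <action application="bridge" data="user/$1"/>
    <!-- Reached only when the bridge did not complete -->
    <action application="answer"/>
    <action application="voicemail" data="default ${domain} $1"/>
  </condition>
</extension>
```

Using the captured $1 in both the bridge and voicemail actions keeps the mailbox tied to whichever extension was dialed, so a single extension block can serve every user in the range.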
In this example, not only have we seen how to play a greeting, record a message, and transform it into a voicemail for a user, we have also met an important part of the FreeSWITCH core: the event system. The FreeSWITCH event system is not an additional module interface like the previous examples; it is a core engine feature that you can use to bind to named events and react accordingly when an event is received. In other words, throughout the FreeSWITCH core, there are events that are sent and received. Modules can bind to (that is, listen for) various events. They can also fire events into the event engine; other modules can listen for those events. You can think of it as similar to other queuing systems such as RabbitMQ (in fact, there is a module that interfaces the internal event system of a FreeSWITCH server with RabbitMQ, so you can integrate it into an enterprise queuing system or have multiple FreeSWITCH servers be part of the same big, distributed queue). As we discussed, the Sofia SIP module binds to (subscribes to) the event designated for MESSAGE_WAITING information. This allows our mod_voicemail module to interact with mod_sofia without either system having any knowledge of the other's existence. The event is blindly fired by mod_voicemail (fire and forget, in military parlance), intercepted by mod_sofia (received, because it has subscribed to it), and translated into the proper SIP message, all courtesy of the event system.
There are several challenges with such a complex system of concatenating sounds when considering all of the possible languages it may need to support, as well as what files to play for the automated messages and how they are strung together. The Say module supplies a nice way to string files together, but it is limited to something specific, such as spelling a word, counting something, or saying a certain date. The way we overcome this is by defining a more complex layer on top of the Say module called Phrase Macros. Phrase Macros are a collection of XML expressions that pull out a list of arguments by matching a regular expression and executing a string of commands. This is very similar to how the XML Dialplan works, only custom-tailored for IVR scenarios. For example, when mod_voicemail asks you to record your message, rather than coding in the string of files to make it say what you want, the code just calls a phrase macro called voicemail_record_message. This arbitrary series of sound bites is defined in the Phrase Macro section in the XML configuration, allowing us, the administrators, to edit the phrase without modifying the Voicemail IVR program:
<macro name="voicemail_record_message">
  <input pattern="^(.*)$">
    <match>
      <action function="play-file" data="voicemail/vm-record_message.wav"/>
    </match>
  </input>
</macro>
When mod_voicemail executes the voicemail_record_message macro, it first matches the pattern, which, in this case, matches everything, because this particular macro has no use for input (that is, whatever input you give it is ignored). If the macro did use the input, the pattern matching could be used to play different sound bites based on different input. Once a match is found, the XML match tag is parsed for action tags just like in our Dialplan example. This macro only plays the file vm-record_message.wav, but more complicated macros, like the ones for verifying your recording or telling you how many messages you have in your inbox, may use combinations of various Say actions and play many different audio files. Phrase Macros are discussed in detail in Chapter 6, XML Dialplan, and used extensively in Chapter 8, Lua FreeSWITCH Scripting.
Here too, we can see co-operation between various parts of the FreeSWITCH architecture: the phrase system, the audio file, and the Say modules loaded by the core are used together to enable powerful functionality. The Say modules are written specifically for a particular language or voice within a language. We can programmatically request to say the current time and have it translated into Spanish or Russian sounds by the proper Say module based on input variables. The Phrase Macro system is a great way to put a layer of abstraction into your code, which can be easily tweaked later by system administrators. For example, if we wanted to make a small IVR that asks us to dial a four-digit number, then reads it back and hangs up, we could make one macro called myapp_ask_for_digits and another called myapp_read_digits. In our code, we would execute these macros by name: the former when it is time to ask for the digits and the latter to read back the digits by passing in the value we entered. Once this is in place, a less-experienced individual (for example, a local administrator) could implement the XML files to play the proper sounds. She can use the Say modules to read back the number, and it should all work in multiple languages with no further coding necessary. Voicemail is just one example of using FreeSWITCH as an application server. There are endless possibilities when we use FreeSWITCH to connect real-time communication with computers.
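The two macros described above might look something like the following. This is only a sketch under stated assumptions: the sound file names under myapp/ are hypothetical placeholders, the invalid-entry prompt path is assumed to exist in the installed sound set, and the say action attributes (method and type) follow the style commonly seen in the phrase files shipped with FreeSWITCH.

```xml
<!-- Asks the caller for four digits; the input is ignored, so the pattern matches anything -->
<macro name="myapp_ask_for_digits">
  <input pattern="^(.*)$">
    <match>
      <!-- Hypothetical prompt file -->
      <action function="play-file" data="myapp/please_enter_four_digits.wav"/>
    </match>
  </input>
</macro>

<!-- Reads back the digits passed in as input; only a four-digit string matches -->
<macro name="myapp_read_digits">
  <input pattern="^(\d{4})$">
    <match>
      <!-- Hypothetical prompt file -->
      <action function="play-file" data="myapp/you_entered.wav"/>
      <!-- Let the Say module for the current language speak the digits one by one -->
      <action function="say" data="$1" method="iterated" type="number"/>
    </match>
    <nomatch>
      <!-- Played when the input is not exactly four digits (assumed prompt path) -->
      <action function="play-file" data="ivr/ivr-that_was_an_invalid_entry.wav"/>
    </nomatch>
  </input>
</macro>
```

Because the second macro actually uses its input, its pattern both validates the value and captures it as $1, which is the kind of input-dependent behavior the record-message macro did not need.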
Multi-party audio/video conferencing
Another important feature of FreeSWITCH is delivered by the mod_conference conferencing module. mod_conference provides dynamic conference rooms that can bridge together the audio and video from several users. It can mix video streams together, applying CG (computer graphics) transformations to them, such as composing a live feed of different conference participants, superimposing a caption with the name and role on each user's video stream, sharing the screen of a participant's computer (for example, a PowerPoint presentation), and so on. A real-time chat can also be added to the conference, so participants can exchange text messages out of band from the main audio/video stream. Naturally, this same module can also be used for plain audio conference calls.
Each new session that connects to the same conference room will join the others, and instantly be able to talk to and see all of the other participants at the same time (subject to the whim of the conference admin, who can choose who is displayed, who can talk, and so on). Using an example similar to the one we used for bridging to another phone, we can make an extension to join a conference room:
<extension name="example 4">
  <condition field="destination_number" expression="^3000$">
    <action application="conference" data="3000@default"/>
  </condition>
</extension>
This is as simple as bridging a call, but with a conference application many callers can call the same extension (3000 in this case) and join the same conference room. If three people joined this conference and one of them decided to leave, the other two would still be able to continue their conversation.
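A single extension can also serve many conference rooms by capturing the dialed number. The following is a minimal sketch, assuming a 3000-3399 room range and an extension name chosen for illustration; it reuses the default conference profile from the example above.

```xml
<extension name="conference_rooms">
  <!-- Capture any room number from 3000 to 3399 into $1 (assumed range) -->
  <condition field="destination_number" expression="^(3[0-3][0-9]{2})$">
    <action application="answer"/>
    <!-- Each dialed number becomes its own room, using the "default" profile -->
    <action application="conference" data="$1@default"/>
  </condition>
</extension>
```

Callers who dial the same number land in the same room, while callers who dial different numbers in the range get separate rooms created on demand.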
The conference module also has other special features, such as the ability to play sound or video files or text-to-speech to the whole conference, or even to a single member of the conference. As you may have guessed, we are able to do this by using the TTS and video/sound file interfaces provided by their respective modules. The smaller pieces come together to extend the functionality without needing knowledge of each other.
The conference module also uses the event system in an additional way, employing what are called custom events. When it first loads, a module can reserve a special event namespace called a subclass. When something interesting happens, such as when a caller joins or leaves a conference, it fires those events on the CUSTOM event channel in the core queue. When we are interested in receiving such events, all we have to do is subscribe to the CUSTOM event, supplying an extra subclass string that identifies the specific CUSTOM events we are interested in. In this case, it is conference::maintenance. This makes it possible to watch for important things such as when someone joins or leaves the conference, when they start and stop talking, when they are displayed on video, or what video layout (screen disposition) is currently in use. Conferencing is discussed in detail in Chapter 13, Conferencing and WebRTC Video-Conferencing.
FreeSWITCH API commands (FSAPI)
Another very powerful FreeSWITCH concept is the FSAPI. Most API commands are implemented in mod_commands, and almost all other modules add commands of their own that are executable via FSAPI. The FSAPI mechanism is very simple: it takes a single string of text as input, which may or may not be parsed, and performs a particular action. The return value is also a string that can be of any size, from a single character up to several pages of text, depending on the function that was called by the input string. One major benefit of FSAPI functions is that a module can use them to call routines in another module without directly linking into the actual compiled code (thus avoiding sudden incompatibilities and crashes). The most obvious example is the command-line interface of FreeSWITCH, or CLI, which uses FSAPI functions to pass FreeSWITCH API commands.
Here is a small example of how we can execute the status FSAPI command from the FreeSWITCH CLI:
What's really happening here is that when we type status and press the Enter key, the word status is used to look up the status FSAPI function from the module in which it is implemented. The underlying function is then called (passing it the arguments if any were typed, in this case none), and the core is queried for its status message. Once the status data is obtained, the output is written to a stream that prints a string.
We have already learned that a module can create and export FSAPI functions that can be executed from anywhere, such as the CLI. But there's more. Modules can also be written to execute commands via the FSAPI interface and then send the results over a specific protocol. There are two modules included in FreeSWITCH that do just that: mod_xml_rpc and mod_event_socket (discussed in Chapter 10, Dialplan, Directory, and ALL via XML_CURL and Scripts, and Chapter 11, ESL - FreeSWITCH Controlled by Events, respectively). Consider the example of mod_xml_rpc. This module implements the standard XML-RPC protocol (Remote Procedure Call via XML strings) as a FreeSWITCH module. Clients using any standard XML-RPC interface can connect to FreeSWITCH and execute FSAPI commands. So a remote client could execute an RPC call to status, and get a status message similar to the one we saw in the previous example. This same module also provides FreeSWITCH with a listening web server, which allows FSAPI commands to be accessed from a direct URL link. For example, one could point a browser to http://example.freeswitch.box:8080/api/status to execute the status command directly over HTTP. By using this technique, it's possible to create FSAPI commands that work similarly to a CGI, providing a dynamic web application that has direct access to FreeSWITCH internals (for a more advanced HTTP integration, you may want to check the HTTAPI module in Chapter 12, HTTAPI - FreeSWITCH Asks Webserver Next Action).
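For the URL above to answer on port 8080, mod_xml_rpc needs a matching configuration. The following sketch is modeled on the sample xml_rpc.conf.xml as we recall it, so the parameter names should be checked against the file shipped with your version; the realm, user name, and password shown are placeholders, not recommended values.

```xml
<configuration name="xml_rpc.conf" description="XML RPC">
  <settings>
    <!-- TCP port of the embedded web server (matches the :8080 in the URL above) -->
    <param name="http-port" value="8080"/>
    <!-- HTTP authentication settings; all three values are placeholders -->
    <param name="auth-realm" value="freeswitch"/>
    <param name="auth-user" value="fsadmin"/>
    <param name="auth-pass" value="change-me"/>
  </settings>
</configuration>
```

Since this interface exposes FSAPI commands over the network, it is worth setting real credentials and restricting access before enabling it outside a lab.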
As we have shown, the FSAPI interface is very versatile. Now we know it can be used to provide a CLI interface, a way for modules to call functions from each other, and a way to export HTTP or XML-RPC functions. There is still one more use for FSAPI functions that we have not covered. We touched briefly on the concept of channel variables earlier, noting that we can use the expression ${myvariable} to get the value of a certain variable. FSAPI functions can also be accessed this way in the format ${myfunction()}. This notation indicates that the FSAPI command myfunction should be called, and that the notation should be replaced with the output of that function call. Therefore, we can use ${status()} anywhere variables are expanded to gain access to the output of the status command. For example:
<action application="set" data="my_status=${status()}"/>
The value placed in the my_status variable will be the string output from the status command.
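The same notation works with FSAPI commands that take arguments. As a small sketch, assuming the strftime FSAPI command provided by mod_commands is available, the following stores a formatted timestamp in a channel variable and writes it to the log; the variable name call_started is made up for this example.

```xml
<!-- ${strftime(...)} is replaced by the output of the strftime FSAPI command -->
<action application="set" data="call_started=${strftime(%Y-%m-%d %H:%M:%S)}"/>
<!-- The stored value is then read back like any other channel variable -->
<action application="log" data="INFO Call started at ${call_started}"/>
```

The text between the parentheses is passed to the command as its single input string, exactly as if it had been typed after the command name at the CLI.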
Most FSAPI commands can be easily accessed using all of the ways we have discussed. Some commands only make sense when accessed via a particular method. For instance, if we made an FSAPI command that produced HTML intended to be accessed with a web browser, we would probably not want to access it from the CLI or by referencing it as a variable. But, never say never, there are cases where it can be useful, and you have the flexibility to do it.
We discussed many of the fundamental components of the FreeSWITCH core and how they interact with each other. We have seen how the event system can carry information across the core to the modules, and how the XML Dialplan can query the XML registry for data. This would be a good time to explain the XML registry a bit more. The XML registry is the XML tree document that holds all of the critical data that FreeSWITCH needs to operate properly. FreeSWITCH builds that document by loading a file from your hard drive and passing it to its own pre-processor. This pre-processor can include other XML documents and execute other special operations, such as setting global variables. Global variables will then be resolved by FreeSWITCH when they're used further down in the document tree.
Once the entire document and all of the included files are parsed, replaced, and generated into a static XML document, this final static document (with all global variables substituted) is loaded into memory. The XML registry (tree) is divided into several sections: configuration, dialplan, directory, chat plan, languages, phrases, and so on. The core and the modules draw their configuration from the configuration section. The XML Dialplan module draws its Dialplan data from the dialplan section. SIP and Verto authentication, user lookup, and the voicemail module read their account information from the directory section. The Phrase Macros pull their configuration from the phrases section. If we make a change to any of the XML files on the disk, we can reload the changes into memory by issuing the reloadxml command from the CLI. If we change the value assigned to one of the global variables, we will need to restart FreeSWITCH to apply the new value; reloadxml will not be enough.
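The pre-processor operations mentioned above are written as special X-PRE-PROCESS tags. The fragment below is a minimal sketch of the two most common ones; the variable value and the include path are assumptions for illustration and are not taken from any particular installation.

```xml
<!-- Define a global variable while the document is being built (placeholder value) -->
<X-PRE-PROCESS cmd="set" data="default_password=1234"/>

<!-- Pull every XML file in a folder into the document (assumed path) -->
<X-PRE-PROCESS cmd="include" data="directory/default/*.xml"/>

<!-- Further down the tree, $${...} is substituted with the global value before loading -->
<param name="password" value="$${default_password}"/>
```

Note the double dollar sign: $${name} refers to a global variable resolved when the XML document is assembled, whereas ${name} is expanded later, at call time, as a channel variable.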
Scripting language modules
Scripting language modules embed a programming language like Lua, JavaScript, Perl, C#, and so on, into FreeSWITCH, and transfer functionality between the core and the language's runtime. This allows things like IVR applications to be written in that scripting language, with a simple interface back to FreeSWITCH for all the heavy lifting. Language modules usually register into the core with the application interface and the FSAPI interface and are executed from the Dialplan. Language modules offer lots of opportunities and are very powerful. Using language modules, you can build powerful real-time communication applications in a standard programming language you already know, using its libraries for data manipulations and legacy interfacing.
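From the Dialplan side, handing a call to a script looks like any other application call. The following is a sketch that assumes mod_lua is loaded; the extension number 5100 and the script name myapp_ivr.lua are hypothetical.

```xml
<extension name="lua_ivr_example">
  <condition field="destination_number" expression="^5100$">
    <action application="answer"/>
    <!-- mod_lua registers a "lua" application; the script file name is a placeholder -->
    <action application="lua" data="myapp_ivr.lua"/>
  </condition>
</extension>
```

The Dialplan only decides when the script runs; everything the caller hears after that point is driven by the script through the interface the language module exposes.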
Understanding all of these concepts right off the bat is far from easy, and as maintainers of the software, we do not expect most people to have everything just click. This is the main reason that every new layer we put on top of the core makes things simpler and easier to learn. The demonstration configuration of FreeSWITCH is the last line of defense between new users of the software and all of the crazy, complicated, and sometimes downright evil stuff better known as Real Time Communication. We try very hard to save the users from such things.
The main purpose of the demonstration configuration in FreeSWITCH is to showcase the hundreds of parameters there are to work with. We present them to you in a working configuration that you could actually leave untouched and play with before trying your own hand at changing some of its options. Think of FreeSWITCH as a Lego set. FreeSWITCH and all of its little parts are like a brand new bucket of Lego bricks, with plenty of parts to build anything we can imagine. The demonstration configuration is like the sample spaceship that you find in the instruction booklet. It contains step-by-step instructions on exactly how to build something you know will work. After you pick up some experience, you might start modifying your Lego ship to have extra features, or rebuild the parts into a car or some other creation. Of course, you can leave out many, or most, of the features built into that configuration and use only what is useful in your specific deployment. The good news about FreeSWITCH is that it comes out of the box already assembled. Therefore, unlike the bucket of Lego bricks, if you get frustrated and smash it to bits, you can just re-install the defaults and you won't have to build it again from scratch. The demonstration configuration is discussed in Chapter 3, Test Driving the Example Configuration.
Once FreeSWITCH has been installed, you only need to start its executable without changing one line in the configuration file. You will immediately be able to point a SIP telephone or software-based SIP softphone to the address of your server (be it your laptop, a virtual machine, a 48-core server, a Raspberry Pi, or an Amazon instance), make a test call, and access all of the functionality of FreeSWITCH. Interfacing with other protocols will require additional configuration (such as installing SSL certificates for WebRTC and the like), but the end results will be exactly the same. If you have more than one phone, using the default configuration you should be able to give each of them an individual extension in the range 1000-1019, which is the extension number range predefined in the demonstration configuration. Once you get the phones registered, you will be able to make calls between them or have them meet in a conference room in the 3000-3399 range. If you call an extension that is not registered, or let the phone ring on another extension for too long, the voicemail application will use the phrase system to indicate that the party is not available, and ask you to record a message. If you dial 5000, you can see an example of the IVR system at work, presenting several menu choices demonstrating various other neat things FreeSWITCH can do. There are a lot of small changes and additions that can be made to the demonstration configuration while still leaving it intact.
For example, using the pre-processor directives we went over earlier, the demonstration configuration loads a list of files into the XML registry from certain places, meaning that every file in a particular folder will be combined into the final XML configuration document. The two most important points where this takes place are where the user accounts and the extensions in the Dialplan are kept. Each of the 20 extensions preconfigured in the defaults is stored in its own file. We could easily create a new file with a single user definition, drop it into place to add another user, and issue the reloadxml command at the FreeSWITCH CLI. The same idea applies to the example Dialplan. We can put a single extension into its own file and load it into place whenever we want.
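A new user file of the kind described above could look roughly like this. It is a sketch modeled on the general shape of the user files in the demonstration configuration, not a copy of one: the user id 1020, the password values, and the exact set of variables are assumptions to adapt to your installation.

```xml
<include>
  <!-- One user per file; 1020 is just the next number after the predefined range -->
  <user id="1020">
    <params>
      <!-- SIP registration password, taken here from an assumed global variable -->
      <param name="password" value="$${default_password}"/>
      <!-- Voicemail PIN (placeholder value) -->
      <param name="vm-password" value="1020"/>
    </params>
    <variables>
      <!-- Dialplan context used for calls placed by this user -->
      <variable name="user_context" value="default"/>
      <variable name="effective_caller_id_name" value="Extension 1020"/>
      <variable name="effective_caller_id_number" value="1020"/>
    </variables>
  </user>
</include>
```

Saved in the folder that the directory include directive scans, a file like this would be picked up on the next reloadxml. Keep in mind that the demonstration Dialplan only routes 1000-1019 by default, so a matching Dialplan change would also be needed before calls to 1020 could be placed.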