
Tech News - Data

1209 Articles

Amazon faces increasing public pressure as HQ2 plans go under the scanner in New York

Natasha Mathur
05 Feb 2019
3 min read
Andrea Stewart-Cousins, majority leader of the New York State Senate, and the Senate Democrats yesterday nominated State Senator Michael Gianaris of Queens to serve on the five-member Public Authorities Control Board (PACB). The news, first reported by the New York Times, has stirred worry among supporters of Amazon's HQ2 proposal to build a 25,000-person office in New York City, announced last November. Gianaris has been a vocal opponent of Amazon HQ2 and, if confirmed to the board, could veto state actions on the project.

"My position on the Amazon deal is clear and unambiguous and is not changing. It's hard for me to say what I would do when I don't know what it is I would be asked to opine on," said Gianaris.

The Amazon HQ2 deal for Long Island City was negotiated by Gov. Andrew Cuomo back in November 2018. "With Amazon committing to expand its headquarters in Long Island City, New York can proudly say that we have attracted one of the largest, most competitive economic development investments in U.S. history," said Cuomo. He now has the final say over whether to approve or refuse the Senate's selection.

The day after Amazon announced its plans to build its 1.5 million square foot corporate headquarters in Long Island City, Queens, Gianaris started a protest against Amazon. He was joined by other New Yorkers who protested against the company's plan and asked for it to be abandoned.

https://twitter.com/SenGianaris/status/1062787029761753088
https://twitter.com/SenGianaris/status/1062693588457394176

Amazon's new campus is supposed to be located along Long Island City's waterfront, across the East River from Manhattan's Midtown East neighborhood. Amazon has promised 50,000 jobs and will take in 25,000 employees with an average wage of $150,000 a year. In return, the company will receive at least $2.8 billion in incentives from the state and city, and if it passes the goal of 25,000 workers in Long Island City, it could also receive state tax breaks. Gianaris does not approve, as he believes that handing $2.8 billion in state and city incentives to Amazon is a "bad deal".

https://twitter.com/SenGianaris/status/1063066018694737920

He even went so far as to call it a '#Scamazon deal'.

https://twitter.com/SenGianaris/status/1090632342719381504

Many people side with Gianaris. According to Stuart Appelbaum, president of the Retail, Wholesale and Department Store Union, Gianaris has "proven himself to be a champion of workers' rights":

https://twitter.com/RWDSU/status/1092536178073653248

Dani Lever, a spokeswoman for Cuomo, said that the recommendation of Gianaris "puts the self-interest of a flip-flopping opponent of the Amazon project above the state's economic growth. Every Democratic Senator will now be called on to defend their opposition to the greatest economic growth potential this state has seen in over 50 years".

Amazon launches TLS Termination support for Network Load Balancer
Sally Hubbard on why tech monopolies are bad for everyone: Amazon, Google, and Facebook in focus
Rights groups pressure Google, Amazon, and Microsoft to stop selling facial surveillance tech to government


‘Have I Been Pwned’ up for acquisition; Troy Hunt code names this campaign ‘Project Svalbard’

Savia Lobo
12 Jun 2019
4 min read
Yesterday, Troy Hunt revealed in a blog post that his 'Have I Been Pwned' (HIBP) website is up for sale. Hunt has codenamed the acquisition Project Svalbard and is working with KPMG to find a buyer.

Hunt named Project Svalbard after the Svalbard Global Seed Vault, a secure seed bank on the Norwegian island of Spitsbergen. The vault holds the world's largest collection of crop diversity, a long-term seed storage facility built for worst-case scenarios such as natural or man-made disasters.

Commercial subscribers depend heavily on HIBP: it alerts members of identity theft programs, enables infosec companies to provide services to their customers, protects large online assets from credential stuffing attacks, helps prevent fraudulent financial transactions, and much more. Governments around the world use HIBP to protect their departments, and law enforcement agencies use it in their investigations.

Hunt says he has been handling everything alone. "To date, every line of code, every configuration and every breached record has been handled by me alone. There is no 'HIBP team', there's one guy keeping the whole thing afloat," he writes.

In January this year he disclosed the Collection #1 data breach, which comprised 87 GB of data in a folder containing 12,000-plus files, nearly 773 million email addresses, and more than 21 million unique passwords from data breaches going back to 2008. Hunt loaded all of this breached data into HIBP, and since then the site has seen a massive influx in activity, taking him away from other responsibilities. "The extra attention HIBP started getting in Jan never returned to 2018 levels, it just kept growing and growing," he says.

Hunt said he was concerned about burnout, given the increasing scale and incidence of data breaches. It was time, he said, for HIBP to "grow up". He also believed HIBP could do more in the space, including widening its capture of breaches.

https://twitter.com/troyhunt/status/1138322112224083968

"There's a whole heap of organizations out there that don't know they've been breached simply because I haven't had the bandwidth to deal with it all," Hunt said. "There's a heap of things I want to do with HIBP which I simply couldn't do on my own. This is a project with enormous potential beyond what it's already achieved and I want to be the guy driving that forward," he wrote.

Hunt includes a list of "commitments for the future of HIBP" in his blog post. He also said he intends to be "part of the acquisition - that is some company gets me along with the project" and that "freely available consumer searches should remain freely available". Via Project Svalbard, Hunt hopes HIBP can reach more people and play "a much bigger role in changing the behavior of how people manage their online accounts."

A couple of commenters on the blog post asked Hunt whether he has considered or approached Mozilla as a potential owner. In a reply to one he writes, "Being a party that's already dependent on HIBP, I reached out to them in advance of this blog post and have spoken with them. I can't go into more detail than that just now, but certainly their use of the service is enormously important to me."

To know more about this announcement in detail, read Troy Hunt's official blog post.
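One of the freely available consumer services Hunt wants preserved is Pwned Passwords, whose k-anonymity range API lets a client check a password against the breach corpus without ever sending the full hash. Below is a minimal sketch of that lookup in Python against the public api.pwnedpasswords.com service; the example password is, of course, hypothetical.

```python
# Minimal sketch: query the Pwned Passwords range API using k-anonymity.
# Only the first five characters of the SHA-1 hash ever leave the machine.
import hashlib
import requests

def pwned_count(password: str) -> int:
    sha1 = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    prefix, suffix = sha1[:5], sha1[5:]
    resp = requests.get(f"https://api.pwnedpasswords.com/range/{prefix}", timeout=10)
    resp.raise_for_status()
    # The response is a list of "HASH_SUFFIX:COUNT" lines for the given prefix.
    for line in resp.text.splitlines():
        candidate, _, count = line.partition(":")
        if candidate == suffix:
            return int(count)
    return 0

if __name__ == "__main__":
    print(pwned_count("password123"))  # hypothetical example password
```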
A security researcher reveals his discovery on 800+ Million leaked Emails available online
The Collections #2-5 leak of 2.2 billion email addresses might have your information, German news site Heise reports
Bo Weaver on Cloud security, skills gap, and software development in 2019


Grafana 6.0 beta is here with new panel editor UX, Google Stackdriver datasource, and Grafana Loki among others

Natasha Mathur
04 Feb 2019
4 min read
Grafana, the data visualization and analytics platform, released the beta of Grafana 6.0 last week. Grafana 6.0 beta brings new features such as Explore, Grafana Loki, a Gauge panel, a new panel editor UX, and a Google Stackdriver datasource, among others.

Grafana is an open source data visualization and monitoring tool that can be used on top of a variety of different data stores, but is commonly used together with Graphite, InfluxDB, Elasticsearch, and Logz.io. Let's discuss the key highlights in Grafana 6.0 beta.

Explore

Explore is a new feature in Grafana 6.0 beta that provides an interactive debugging workflow and helps integrate metrics and logs. The Prometheus query editor in Explore has improved autocomplete, a metric tree selector, and integrations with the Explore table view. This allows easy label filtering and offers useful query hints that can automatically apply functions to your query. There is no need to switch to other tools for debugging, since Explore lets you dig deeper into your metrics and logs to find the cause of a bug. Grafana's new logging datasource, Loki, is also tightly integrated into Explore, enabling you to correlate metrics and logs by viewing them side by side. Explore supports splitting the view, so you can easily compare different queries, datasources, metrics, and logs.

Grafana Loki

The log exploration and visualization features in Explore are available for any datasource, but have so far been implemented only by the new open source log aggregation system from Grafana Labs, called Grafana Loki. Grafana Loki is a horizontally scalable, highly available, multi-tenant log aggregation system inspired by Prometheus. It is very cost effective because it does not index the contents of the logs, only a set of labels for each log stream. Logs in Loki are queried in a similar way to querying with label selectors in Prometheus; Loki uses labels to group log streams, which can be made to match up with your Prometheus labels (a query sketch follows at the end of this article).

New panel editor

Grafana 6.0 beta has a new, redesigned UX around editing panels. The new panel editor lets you resize the visualization area in case you want more space for queries and options. It also allows you to change the visualization (panel type) from within the new panel edit mode, eliminating the need to add a new panel just to try out different visualizations.

Azure Monitor datasource

The Grafana team worked on an external plugin for Azure Monitor last year, and it is now being moved into Grafana as one of the built-in datasources. As a core datasource, the Azure Monitor datasource will get alerting support for the official Grafana 6.0 release. It integrates four different Azure services with Grafana: Azure Monitor, Azure Log Analytics, Azure Application Insights, and Azure Application Insights Analytics.

Other changes

Grafana 6.0 beta comes with a new, separate Gauge panel. The Gauge panel contains a new threshold editor that the team plans to refine and use in other panels.

Built-in support for Google Stackdriver has been officially released in Grafana 6.0 beta.

Grafana 6.0 beta adds support for provisioning alert notifiers from configuration files. This feature allows operators to provision notifiers without using the UI or the API. A new uid field (a string identifier) has been added that administrators can set themselves.

The Elasticsearch datasource in Grafana 6.0 beta now supports bucket script pipeline aggregations, which allow per-bucket computations such as the difference or ratio between two metrics.

The color picker has been updated to show named colors and primary colors, which improves accessibility and makes colors more consistent across dashboards.

For more information, check out the official Grafana 6.0 beta release notes.

Grafana 5.3 is now stable, comes with Google Stackdriver built-in support, a new Postgres query builder
Cortex, an open source, horizontally scalable, multi-tenant Prometheus-as-a-service becomes a CNCF Sandbox project
Tumblr open sources its Kubernetes tools for better workflow integration
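Returning to the label-selector querying model described in the Grafana Loki section above, here is a minimal Python sketch that pulls recent log lines from a Loki instance over its HTTP API. The /loki/api/v1/query_range endpoint and its parameters come from later Loki releases than the one that shipped alongside this beta, and the localhost address and job label are assumptions for illustration.

```python
# Minimal sketch: pull recent log lines from Loki by label selector,
# mirroring how series are filtered with label selectors in Prometheus.
# Endpoint path, address, and the "varlogs" job label are assumptions.
import time
import requests

now_ns = int(time.time() * 1e9)            # Loki expects nanosecond timestamps
params = {
    "query": '{job="varlogs"}',            # Prometheus-style label selector
    "start": now_ns - 3600 * 10**9,        # last hour
    "end": now_ns,
    "limit": 20,
}
resp = requests.get("http://localhost:3100/loki/api/v1/query_range",
                    params=params, timeout=10)
resp.raise_for_status()
for stream in resp.json()["data"]["result"]:
    print(stream["stream"], "->", len(stream["values"]), "lines")
```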


Facebook sets aside $5 billion in anticipation of an FTC penalty for its “user data practices”

Savia Lobo
25 Apr 2019
4 min read
Yesterday, Facebook revealed in its first-quarter financial report that it has set aside $5 billion in anticipation of a fine from the US Federal Trade Commission (FTC). The charge is "in connection with the inquiry of the FTC into our platform and user data practices", the company said.

The company noted in its report that the expense results in a 51% year-over-year decline in net income, to just $2.4bn. Excluding this one-time expense, Facebook's earnings per share would have beaten analyst expectations, and its operating margin (22%) would have been 20 points higher. Facebook said, "We estimate that the range of loss in this matter is $3.0bn to $5.0bn. The matter remains unresolved, and there can be no assurance as to the timing or the terms of any final outcome."

In the wake of the Cambridge Analytica scandal, the FTC began its investigation into Facebook's privacy practices in March last year. The investigation focused on whether the data practices that allowed Cambridge Analytica to obtain Facebook user data violated the company's 2011 agreement with the FTC. "Facebook and the FTC have reportedly been negotiating over the settlement, which will dwarf the prior largest penalty for a privacy lapse, a $22.5m fine against Google in 2012", The Guardian reports.

Read also: European Union fined Google 1.49 billion euros for antitrust violations in online advertising

"Levying a sizable fine on Facebook would go against the reputation of the United States of not restraining the power of big tech companies", The New York Times reports. Justin Brookman, a former official at the regulator who is currently a director of privacy at Consumers Union, a nonprofit consumer advocacy group, said, "The F.T.C. is really limited in what they can actually do in enforcing a consent decree, but in the case of Facebook, they had public pressure on their side."

Christopher Wylie, a research director at H&M and the Cambridge Analytica whistleblower, spoke out against Facebook, tweeting, "Facebook, you banned me for whistleblowing. You threatened @carolecadwalla and the Guardian. You tried to cover up your incompetent conduct. You thought you could simply ignore the law. But you can't. Your house of cards will topple."

https://twitter.com/chrisinsilico/status/1121150233541525505

Senator Richard Blumenthal, Democrat of Connecticut, said in a tweet, "Facebook must be held accountable — not just by fines — but also far-reaching reforms in management, privacy practices, and culture."

Debra Aho Williamson, an eMarketer analyst, warned that the expectation of an FTC fine may portend future trouble. "This is a significant development, and any settlement with the FTC may impact the ways advertisers can use the platform in the future," she said. Jessica Liu, a marketing analyst for Forrester, said that Facebook has to show signs that it is improving on user data practices and content management: "Its track record has been atrocious. No more platitudes. What action is Facebook Inc actually taking?"

"For Facebook, a $5 billion fine would amount to a fraction of its $56 billion in annual revenue. Any resolution would also alleviate some of the regulatory pressure that has been intensifying against the company over the past two and a half years", the New York Times reports.

To know more about this news in detail, visit Facebook's official press release.
Facebook hires a new general counsel and a new VP of global communications even as it continues with no Chief Security Officer
Facebook shareholders back a proposal to oust Mark Zuckerberg as the board's chairperson
"Is it actually possible to have a free and fair election ever again?," Pulitzer finalist Carole Cadwalladr on Facebook's role in Brexit


PipelineDB 1.0.0, the high-performance time-series aggregation extension for PostgreSQL, released!

Melisha Dsouza
25 Oct 2018
3 min read
Three years ago, the PipelineDB team published the very first release of PipelineDB as a fork of PostgreSQL. It received enormous support and feedback from thousands of organizations worldwide, including several Fortune 100 companies, and one of the most frequent requests was to release the fork as an extension of PostgreSQL. Yesterday, the team released PipelineDB 1.0.0 as a PostgreSQL extension under the liberal Apache 2.0 license.

What is PipelineDB?

PipelineDB is designed for storing huge amounts of time-series data that need to be continuously aggregated. It stores only the compact output of these continuous queries as incrementally updated table rows, which can be evaluated with minimal query latency. It is used for analytics use cases that only require summary data, for instance real-time reporting dashboards. PipelineDB is especially beneficial in scenarios where queries are known in advance. These queries can be run continuously, making the data infrastructure that powers real-time analytics applications simpler, faster, and cheaper compared to the traditional "store first, query later" data processing model.

How does PipelineDB work?

PipelineDB uses SQL to write time-series events to streams, which are structured like tables. A continuous view then performs an aggregation over the stream. Even if billions of rows are written to the stream, the continuous view ensures that only one physical row per hour is actually persisted within the database. Once the continuous view has read new incoming events and the distinct count has been updated to reflect the new information, the raw events are discarded and not stored in PipelineDB. This enables:

Enormous levels of raw event throughput on modest hardware footprints
Extremely low read query latencies
Breaking the traditional dependence between data volumes ingested and data volumes stored

All of this gives the system high performance that can be sustained indefinitely. PipelineDB also supports another type of continuous query, called continuous transforms. Continuous transforms are stateless: they apply a transformation to a stream and write the result out to another stream. A short sketch of how streams and continuous views fit together follows below.

Features of PipelineDB

PipelineDB 1.0.0 brings a number of changes over version 0.9.7. The main highlights are as follows:

Non-standard syntax has been removed
Configuration parameters are now qualified by pipelinedb
PostgreSQL pg_dump, pg_restore, and pg_upgrade tooling is now used instead of the PipelineDB variants
Certain functions and aggregates have been renamed to describe the problem they solve for users: "Top-K" now refers to Filtered-Space-Saving, "Distributions" now refer to T-Digests, and "Frequency" now refers to Count-Min-Sketch
Bloom filters have been introduced for set membership analysis
Distributions and percentiles analysis is now possible

What's more? Continuous queries can be chained together into arbitrarily complex topologies of continuous computation. Each continuous query produces its own output stream of incremental updates, which can be consumed by another continuous query like any other stream. The team aims to follow up with automated partitioning for continuous views in an upcoming release.

You can head over to the PipelineDB blog for more insights on this news.
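To make the stream and continuous-view model concrete, here is a minimal Python sketch against a local PostgreSQL instance with the pipelinedb extension installed. The exact DDL (foreign-table streams served by pipelinedb, views created with action=materialize) is based on the 1.0 documentation and should be treated as an assumption; the table and column names are hypothetical.

```python
# Minimal sketch, assuming PostgreSQL with the pipelinedb extension installed.
# DDL syntax follows the PipelineDB 1.0 docs; treat it as an assumption.
import psycopg2

conn = psycopg2.connect("dbname=analytics user=postgres")
conn.autocommit = True
cur = conn.cursor()

# Streams are declared as foreign tables served by the pipelinedb extension.
cur.execute("CREATE FOREIGN TABLE page_views (url text, ts timestamptz) SERVER pipelinedb")

# A continuous view incrementally maintains one aggregate row per hour;
# raw events are discarded once they have been read.
cur.execute("""
    CREATE VIEW hourly_uniques WITH (action=materialize) AS
    SELECT date_trunc('hour', ts) AS hour, count(DISTINCT url) AS uniques
    FROM page_views
    GROUP BY hour
""")

# Writing to the stream is a plain INSERT; reading the view is a plain SELECT.
cur.execute("INSERT INTO page_views (url, ts) VALUES ('/home', now())")
cur.execute("SELECT * FROM hourly_uniques")
print(cur.fetchall())
```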
Citus Data to donate 1% of its equity to non-profit PostgreSQL organizations
PostgreSQL 11 is here with improved partitioning performance, query parallelism, and JIT compilation
PostgreSQL group releases an update to 9.6.10, 9.5.14, 9.4.19, 9.3.24


Data Transfer Project: Now Apple joins Google, Facebook, Microsoft and Twitter to make data sharing seamless

Vincy Davis
01 Aug 2019
2 min read
Yesterday, the Data Transfer Project (DTP) announced on its website that Apple has officially joined the project as a contributor, alongside other tech giants like Google, Facebook, Microsoft, and Twitter.

Read more: Google, Microsoft, Twitter, and Facebook team up for Data Transfer Project

The Data Transfer Project, launched in 2018, is an open source, service-to-service data portability platform which allows individuals to move their data across the web whenever they want. The seamless transfer of data aims to give users more control of their data across the web. Its tools will make it possible for users to port their music playlists, contacts, or documents from one service to another without much effort.

Currently, the DTP has 18 contributors. Its partners and the open source community have contributed more than 42,000 lines of code and changed more than 1,500 files in the project. Other services, such as Deezer, Mastodon, and Solid, have also joined the project. New cloud logging and monitoring framework features and new APIs from Google Photos and SmugMug have also been added.

The Data Transfer Project is still in the development stage; its official site states that "We are continually making improvements that might cause things to break occasionally. So as you are trying things please use it with caution and expect some hiccups." Its GitHub page has seen regular updates since launch and currently has 2,480 stars, 209 forks, and 187 watchers.

Many users are happy that Apple has joined the project, as it means easier transfer of their data.

https://twitter.com/backlon/status/1156259766781394944
https://twitter.com/humancell/status/1156549440133632000
https://twitter.com/BobertHepker/status/1156352450875592704

Some users suspect that such projects will encourage unethical sharing of user data.

https://twitter.com/zananeichan/status/1156416593913667585
https://twitter.com/sarahjeong/status/1156313114788241408

Visit the Data Transfer Project website for more details.

Google Project Zero reveals six "interactionless" bugs that can affect iOS via Apple's iMessage
Softbank announces a second AI-focused Vision Fund worth $108 billion with Microsoft, Apple as major investors
Apple advanced talks with Intel to buy its smartphone modem chip business for $1 billion, reports WSJ

Facebook open-sources Hyperparameter autotuning for fastText to automatically find best hyperparameters for your dataset

Amrata Joshi
27 Aug 2019
3 min read
Two years ago, the team at the Facebook AI Research (FAIR) lab open-sourced fastText, a library used for building scalable solutions for text representation and classification. To make models work efficiently on datasets with a large number of categories, finding the best hyperparameters is crucial, yet searching for them manually is difficult because the effect of each parameter varies from one dataset to another. To address this, Facebook has developed an autotune feature in fastText that automatically finds the best hyperparameters for your dataset. Yesterday, the company announced that it is open-sourcing the hyperparameter autotuning feature for the fastText library.

What are hyperparameters?

Hyperparameters are parameters whose values are fixed before the training process begins. They are critical components of an application and can be tuned to control how a machine learning algorithm behaves. Searching for the best hyperparameters matters because an algorithm's performance can depend heavily on their selection.

The need for hyperparameter autotuning

It is difficult and time-consuming to search for the best hyperparameters manually, even for expert users. The new feature makes this task easier by automatically determining the best hyperparameters for building an efficient text classifier. To use autotuning, a researcher provides the training data, a validation set, and a time constraint. The researcher can also constrain the size of the final model using fastText's compression techniques; building a size-constrained text classifier is useful for deploying models on devices or in the cloud while keeping a small memory footprint.

With hyperparameter autotuning, researchers can now easily build a memory-efficient classifier for various tasks, including language identification, sentiment analysis, tag prediction, spam detection, and topic classification. The team's strategy for exploring hyperparameters is inspired by existing tools such as Nevergrad, but has been tailored to fastText to exploit the specific structure of its models. The autotune feature explores hyperparameters by initially sampling in a large domain that shrinks around the best combinations over time (a short usage sketch follows at the end of this article).

This new feature could be seen as a competitor to Amazon SageMaker Automatic Model Tuning. In Amazon's model, however, the user needs to select the hyperparameters to be tuned, a range for each parameter to explore, and the total number of training jobs, while Facebook's hyperparameter autotuning selects the hyperparameters automatically.

To know more about this news, check out Facebook's official blog post.

Twitter and Facebook removed accounts of Chinese state-run media agencies aimed at undermining Hong Kong protests
Facebook must face privacy class action lawsuit, loses facial recognition appeal, U.S. Court of Appeals rules
Facebook research suggests chatbots and conversational AI are on the verge of empathizing with humans
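As promised above, here is a minimal sketch of what autotuning looks like with the fastText Python bindings. The file names and the ten-minute budget are hypothetical; the parameter names follow the fastText documentation.

```python
# Minimal sketch: let fastText search hyperparameters against a validation file.
# train.txt / valid.txt are hypothetical files in fastText's supervised format,
# i.e. one example per line prefixed with __label__<category>.
import fasttext

model = fasttext.train_supervised(
    input="train.txt",
    autotuneValidationFile="valid.txt",  # drives the automatic search
    autotuneDuration=600,                # search budget in seconds
    autotuneModelSize="2M",              # optional size constraint (quantized model)
)

# Evaluate and use the tuned classifier.
print(model.test("valid.txt"))           # (n_examples, precision@1, recall@1)
print(model.predict("which baking dish is best to bake a banana bread ?"))
```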


What can we expect from TensorFlow 2.0?

Savia Lobo
17 Sep 2018
3 min read
Last month, Google announced that the TensorFlow community plans to release a preview of TensorFlow 2.0 later this year, though the date for the preview release has not been disclosed yet. The 2.0 version will include major highlights such as improved eager execution, improved compatibility, support for more platforms and languages, and much more.

Key highlights in TensorFlow 2.0

Eager execution will be an important feature of TensorFlow 2.0. It aligns users' expectations about the programming model better with TensorFlow practice, which should make TensorFlow easier to learn and apply (a small eager-mode sketch follows at the end of this article). This version also includes support for more platforms and languages, with improved compatibility and parity between these components via standardization on exchange formats and alignment of APIs. The community plans to remove deprecated APIs and reduce the amount of duplication, which has caused confusion for users.

Other improvements in TensorFlow 2.0

Increased compatibility and continuity

TensorFlow 2.0 is an opportunity to correct mistakes and to make improvements which are otherwise restricted under semantic versioning. To ease the transition, the community plans to create a conversion tool that updates Python code to use TensorFlow 2.0 compatible APIs and warns in cases where conversion is not possible automatically. A similar tool helped tremendously during the transition to 1.0. As not all changes can be made fully automatically, the community plans to deprecate some APIs that do not have a direct equivalent. For such cases, they will offer a compatibility module (tensorflow.compat.v1) which contains the full TensorFlow 1.x API and will be maintained through the lifetime of TensorFlow 2.x.

On-disk compatibility

The community will not be making any breaking changes to SavedModels or stored GraphDefs; the plan is to include all current kernels in 2.0. However, the changes in 2.0 mean that variable names in raw checkpoints might have to be converted before they are compatible with new models.

Improvements to tf.contrib

As part of releasing TensorFlow 2.0, the community will stop distributing tf.contrib. For each of the contrib modules, they plan to either integrate the project into TensorFlow, move it to a separate repository, or remove it entirely. This means that all of tf.contrib will be deprecated, and the community will stop adding new tf.contrib projects.

The following YouTube video by Aurélien Géron explains the changes in TensorFlow 2.0 in detail.

https://www.youtube.com/watch?v=WTNH0tcscqo

Understanding the TensorFlow data model [Tutorial]
TensorFlow announces TensorFlow Data Validation (TFDV) to automate and scale data analysis, validation, and monitoring
Intelligent mobile projects with TensorFlow: Build your first Reinforcement Learning model on Raspberry Pi [Tutorial]
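As a small illustration of why eager execution matters, here is a minimal sketch contrasting it with the graph-and-session style of TensorFlow 1.x; the tensor values are arbitrary.

```python
# Minimal sketch: with eager execution (the default in TensorFlow 2.x),
# operations run immediately and return concrete values, so no session
# or explicit graph construction is needed.
import tensorflow as tf

x = tf.constant([[1.0, 2.0], [3.0, 4.0]])
y = tf.matmul(x, x) + 1.0
print(y.numpy())  # evaluated right away, like ordinary Python/NumPy

# Legacy 1.x-style code can keep running through the compatibility module,
# e.g. tf.compat.v1.disable_eager_execution() and tf.compat.v1.Session().
```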


Diversity in Faces: IBM Research’s new dataset to help build facial recognition systems that are fair

Sugandha Lahoti
30 Jan 2019
2 min read
IBM Research has released the 'Diversity in Faces' (DiF) dataset, which is meant to help build better, more diverse facial recognition systems by ensuring fairness. DiF provides annotations for 1 million human facial images, built using publicly available images from the YFCC-100M Creative Commons dataset.

Building facial recognition systems that meet fairness expectations has been a long-standing goal for AI researchers. Most AI systems learn from datasets, and if they are not trained with robust and diverse datasets, accuracy and fairness are at risk. For that reason, AI developers and the research community need to be thoughtful about what data they use for training. With the new DiF dataset, IBM researchers are building a strong, fair, and diverse dataset.

The DiF dataset does not just measure different faces by age, gender, and skin tone. It also looks at other intrinsic facial features, including craniofacial distances, areas and ratios, facial symmetry and contrast, subjective annotations, and pose and resolution. IBM annotated the faces using 10 well-established and independent coding schemes from the scientific literature, selected for their strong scientific basis, computational feasibility, numerical representation, and interpretability.

Through thorough statistical analysis, IBM researchers found that the DiF dataset provides a more balanced distribution and broader coverage of facial images compared to previous datasets. Their analysis of the 10 initial coding schemes also gave them an understanding of what is important for characterizing human faces. In the future, they plan to use Generative Adversarial Networks (GANs) to generate faces of any variety and synthesize training data as needed. They will also find ways (and encourage others) to improve on the initial ten coding schemes and add new ones.

You can request access to the DiF dataset on the IBM website, and read the research paper for more information.

Admiring the many faces of Facial Recognition with Deep Learning
Facebook introduces a fully convolutional speech recognition approach and open sources wav2letter++ and flashlight
AWS updates the face detection, analysis and recognition capabilities in Amazon Rekognition


Fitness app Polar reveals military secrets

Richard Gall
09 Jul 2018
3 min read
You might remember that back in January, fitness app Strava was revealed to be giving away military secrets: when used by military personnel, the app exposed the locations of potentially sensitive sites. Well, it's happening again. This time another fitness app, Polar, is unwittingly giving up sensitive military locations.

The digital investigation organization Bellingcat was able to scrape data from 200 sites around the world, gaining information on the exercises of nearly 6,500 Polar users. The level of detail Bellingcat obtained was remarkable. It was not only able to learn more about military locations, information that could be critical to national security, but also a startling level of information about the people who work at them.

The investigation echoes the Strava data leak and underlines the disturbing privacy issues that fitness tracking applications have been unable to confront. Bellingcat explains that Polar is actually one of the worst apps for publicizing private data. On Strava and Garmin, for example, it's only possible to see individual exercises done by users. "Polar makes it far worse by showing all the exercises of an individual done since 2014, all over the world on a single map."

Polar reveals dangerous levels of detail about its users

Some of the information found by Bellingcat is terrifying. For example: "A high-ranking officer of an airbase known to host nuclear weapons can be found jogging across the compound in the morning. From a house not too far from that base, he started and finished many more runs on early Sunday mornings. His favorite path is through a forest, but sometimes he starts and ends at a car park further away. The profile shows his full name."

The investigators also revealed they were able to cross-reference profiles with social media accounts. This could allow someone to build up a very detailed picture of a member of the military or security personnel, some of whom have access to nuclear weapons.

Bellingcat's advice to fitness app users

Bellingcat offers some clear advice to anyone using fitness tracking apps like Polar. Most of it sounds obvious, but it's clear that even people who should be particularly careful aren't following it: "As always, check your app-permissions, try to anonymize your online presence, and, if you still insist on tracking your activities, start and end sessions in a public space, not at your front door."

The results of the investigation are, perhaps, just another piece in a broader story emerging this year about techno-scepticism. Problems with tech have always existed; it's only now that they are really surfacing and taking on a new urgency. This is going to have implications for the military for sure, but it is also likely to have an impact on the way these applications are built in the future.

Read next
The risk of wearables – How secure is your smartwatch?
Computerizing our world with wearables and IoT

Elasticsearch 6.5 is here with cross-cluster replication and JDK 11 support

Natasha Mathur
16 Nov 2018
4 min read
The Elastic team released version 6.5.0 of their open source distributed, RESTful search and analytics engine, Elasticsearch, earlier this week. Elasticsearch 6.5.0 brings features such as cross-cluster replication, new source-only snapshots, SQL/ODBC changes, and new security features, among others. Elasticsearch is a search engine built on the Lucene library that provides a distributed, multitenant-capable full-text search engine with an HTTP web interface and schema-free JSON documents. Let's now discuss the main features in Elasticsearch 6.5.0.

Cross-cluster replication

Elasticsearch 6.5.0 comes with cross-cluster replication, a Platinum-level feature for Elasticsearch. Cross-cluster replication allows you to create an index in a local cluster that follows an index in a remote cluster, or to automatically follow indices in a remote cluster matching a pattern.

New source-only snapshots

Elasticsearch 6.5 comes with new source-only snapshots that store a minimal amount of information (namely, the _source and index metadata), allowing the indices to be rebuilt through a reindex operation when necessary. This can reduce the disk space of snapshots by up to 50%; the trade-off is that they can take longer to restore in full, since a reindex is needed to make them searchable.

SQL/ODBC changes

An initial (alpha status) ODBC driver has been added for Elasticsearch 6.5.0. Since ODBC is supported by many BI tools, it makes it easy to connect Elasticsearch to a lot of your favourite third-party tools, giving you the speed, flexibility, and power of full-text search and relevance. A few new functions and capabilities have also been added to Elasticsearch's SQL support, including ROUND, TRUNCATE, IN, MONTHNAME, DAYNAME, QUARTER, CONVERT, as well as a number of string manipulation functions such as CONCAT, LEFT, RIGHT, REPEAT, POSITION, LOCATE, REPLACE, SUBSTRING, and INSERT. You can now also query across indices with different mappings, as long as the mapping types are compatible (a short sketch of querying over REST follows at the end of this article).

New scriptable token filters

Elasticsearch 6.5 introduces new scriptable token filters, namely predicate and conditional. The predicate token filter allows you to remove tokens that don't match a script. The conditional token filter builds on the idea of scriptable token filters but lets you apply other token filters to tokens matching a script. These let you manipulate the data you're indexing without having to write a Java plugin. Elasticsearch 6.5 also adds a new text type called annotated_text, which allows you to use markdown-like syntax to link to different entities in applications using natural language processing.

JDK 11 and G1GC

Elasticsearch 6.5 offers support for JDK 11 and also supports the G1 garbage collector on JDK 10+.

Security and audit logging

Elasticsearch 6.5 comes with two new security features: authorization realms and audit logging. Authorization realms enable an authenticating realm to delegate the task of pulling the user information (the username, the user's roles, and so on) to one or more other realms. Audit logging gains a new, completely structured format in which all attributes are named: each log entry is a one-line JSON document printed on a separate line, with attributes ordered as in any other normal log entry.

Multi-bucket analysis

A multi-metric machine learning job analyzes multiple time series together. Elasticsearch 6.5 introduces multi-bucket analysis for machine learning jobs, where features from multiple contiguous buckets are used for anomaly detection. The final anomaly score combines values from both the "standard" single-bucket analysis and the new multi-bucket analysis.

Additionally, Elasticsearch 6.5 comes with an experimental find file structure API, which aims to help discover the structure of a text file. It attempts to read the file and, on success, returns statistics about the common values of the detected fields and mappings that can be used for ingesting the file into Elasticsearch.

For more information, check out the official Elasticsearch 6.5 blog.

Dejavu 2.0, the open source browser by ElasticSearch, now lets you build search UIs visually
Search company Elastic goes public and doubles its value on day 1
How does Elasticsearch work? [Tutorial]
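Returning to the SQL/ODBC changes above, here is a minimal sketch of exercising a few of the new SQL functions through Elasticsearch's SQL REST endpoint. In 6.x the endpoint lived under _xpack/sql (later releases moved it to _sql); the index and field names are hypothetical.

```python
# Minimal sketch: run an Elasticsearch SQL query over REST and print the
# tabular (format=txt) response. The "employees" index and its fields are
# hypothetical; /_xpack/sql is the 6.x endpoint path.
import requests

sql = (
    "SELECT CONCAT(first_name, ' ', last_name) AS name, "
    "ROUND(salary, 0) AS salary "
    "FROM employees "
    "WHERE MONTHNAME(hire_date) = 'January' "
    "LIMIT 5"
)
resp = requests.post(
    "http://localhost:9200/_xpack/sql",
    params={"format": "txt"},
    json={"query": sql},
    timeout=10,
)
resp.raise_for_status()
print(resp.text)
```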


Facebook tweet explains 'server config change' for 14-hour outage on all its platforms

Fatema Patrawala
15 Mar 2019
5 min read
Facebook has said a "server configuration change" was to blame for a 14-hour outage of its services, which took down the main Facebook service along with its Messenger, WhatsApp, and Instagram apps. "Yesterday, as a result of a server configuration change, many people had trouble accessing our apps and services. We've now resolved the issues and our systems are recovering. We're very sorry for the inconvenience and appreciate everyone's patience," said Facebook in a tweet.

The outage started at around 09:00 Pacific Time (16:00 UTC) on Wednesday and wasn't fully resolved until 23:00 (06:00 UTC) – an extraordinary delay for a service used by billions globally. That brief and vague explanation – with no promise of an in-depth report to come – has left users and observers surprised and disappointed. Any company providing a service of similar size and impact, such as a network operator, would be expected to provide constant updates and make its executives available to publicly explain what went wrong.

It's not as if Facebook is allergic to revealing technical details about itself: it has a whole sub-site dedicated to its internal software and data-center engineering work, though there's not a word there about its latest outage. In contrast, Google also suffered a cloud platform outage yesterday, for about four hours, and its postmortem is detailed: a key part of its backend storage system was overloaded with requests after changes were made to counter a sudden demand for space to hold object metadata, ultimately causing services to stall. Similarly, in January Microsoft faced an outage of approximately four hours which affected its various cloud services. It identified the cause as a third-party network provider issue affecting authentication to Microsoft accounts, immediately shifted to an alternate network provider, and gave users a detailed report on the incident.

Unlike almost every other company running a communications service for millions of users, Facebook does not even provide a public system status dashboard. It has a dashboard for app developers: "We are currently experiencing issues that may cause some API requests to take longer or fail unexpectedly. We are investigating the issue and working on a resolution," it noted a few hours ago, somewhat stating the bleeding obvious. While communications companies go out of their way to reach out to media outlets and explain major multi-hour outages in order to maintain public confidence in their networks, Facebook seems to feel no obligation to do so.

We need a fair explanation

Digging into the limited explanation of a "server configuration change" as the source of the problem, that terminology is so vague as to be useless. What sort of change? On what servers? What was the change intended to achieve? Was it tested beforehand? Was it rolled out gradually, or suddenly across all regions – and if the latter, why? Why was a rollback not immediately initiated? And if it was, why didn't it work? Why did it take 14 hours to resolve? These are some of the questions you would expect a huge technology company to answer.

Instead, the best explanation we've found is a hypothetical rundown by Facebook's former chief information security officer Alex Stamos, who assumes that Facebook engineers did initiate an automated rollback but that "the automated system doesn't know how to handle the problem, and gets stuck in some kind of loop that causes more damage. Humans have to step in, stop it, and restart a complex web of interdependent services on hundreds of thousands of systems."

Just this month, US Senator Elizabeth Warren (D-MA) made the argument that services like Facebook, Google, and Amazon have become so large and so fundamental to the digital era that they should be viewed – and legislated as – "platform utilities", and that the revenue-making aspects (products, ads, etc.) of these companies should be broken off as separate entities. When Facebook refuses even to provide a proper explanation for a 14-hour outage, the argument that there needs to be legislative oversight only grows stronger. Related to this, it was revealed yesterday by the New York Times that Facebook is being investigated by a grand jury for possible criminal charges for sharing people's private data with other companies without seeking the consent of, or even informing, those who were affected.

Is there more to this than meets the eye?

The other big question is how a "server configuration change" led to not just Facebook but also its Messenger, WhatsApp, and Instagram services going down. One theory is that Facebook has either connected them up, or attempted to connect them up, at a low level, merging them into one broad platform. In January, CEO Mark Zuckerberg announced plans to intertwine the company's instant-chat applications and social network. Was the outage a result of Facebook trying to combine systems and get ahead of regulators, especially when this month an open debate started over whether Facebook's takeover of Instagram and WhatsApp should be rolled back?

The timing of it all makes today's breaking news, that two top executives are leaving Facebook, even more enigmatic. CEO Mark Zuckerberg wrote about the departure of Chris Cox, Chief Product Officer, and Chris Daniels, WhatsApp Vice President, on his blog. We wait and watch for Facebook to come up with a detailed explanation, though that seems very unlikely.

Facebook family of apps hits 14 hours outage, longest in its history
Facebook under criminal investigations for data sharing deals: NYT report
Facebook deletes and then restores Warren's campaign ads after she announced plans to break up Facebook


Huawei launches HiAI

Richard Gall
04 Apr 2018
2 min read
Huawei launched the P20 to considerable acclaim. But the launch featured news that's even more exciting, particularly if you're a machine learning developer or aficionado: the Chinese telecoms giant has launched HiAI, its artificial intelligence engine, to coincide with the P20's release.

What is HiAI?

HiAI is Huawei's AI engine. It will power applications on the new P20 mobile, giving users an experience that contains some of the most exciting artificial intelligence capabilities on the planet. More importantly, it will also open up new opportunities for mobile developers and machine learning engineers, who can now download the Driver Development Kit (DDK), IDE, and SDK to begin using HiAI.

HiAI's key features

Huawei has made sure HiAI brings a range of artificial intelligence features; it certainly looks like it should be enough to compete with other innovators in the space. Here are some of the key features of the software:

Automatic Speech Recognition – turns human speech into text; this isn't available outside of China at the moment.
Natural Language Understanding engine – complements the work done by the ASR engine above, allowing a computer to 'interpret' various dimensions of human language and 'act' accordingly.
Computer vision – computer vision is what makes a number of popular mobile apps possible, for example face-aging apps or Snapchat-style filters. HiAI includes a computer vision engine capable of facial and object recognition.

HiAI is only going to make Huawei's new phone better: the more applications that are able to use artificial intelligence, the more attractive the phone will be to consumers. Certainly, Huawei is an underrated giant of the telecoms space, particularly when it comes to consumer tech. With its new artificial intelligence engine, it might have created something that could be the beginning of more success and greater market share outside of China.

Learn more on Huawei's website.

Source: XDA

Cockroach Labs announced managed CockroachDB-as-a-Service

Amrata Joshi
31 Oct 2018
3 min read
This week, Cockroach Labs announced the availability of Managed CockroachDB, a fully hosted and managed service, created and run by Cockroach Labs, for its open source geo-distributed database CockroachDB. The service makes deploying, scaling, and managing CockroachDB effortless. Last year, the company announced version 1.0 of CockroachDB and $27 million in Series B financing, led by Redpoint with participation from Benchmark, GV, Index Ventures, and FirstMark.

Managed CockroachDB is cloud agnostic and available on AWS and GCP. The goal is to allow development teams to focus on building highly scalable applications without worrying about infrastructure operations. CockroachDB's design makes data easy to work with by providing an industry-leading model for horizontal scalability and resilience to accommodate fast-growing businesses. It also makes it easier to move data closer to customers depending on their geo-location.

Fun fact: why the name 'Cockroach'? In a post published by Cockroach Labs three years back, Spencer Kimball, CEO at Cockroach Labs, said, "You've heard the theory that cockroaches will be the only survivors post-apocalypse? Turns out modern database systems have a lot to gain by emulating one of nature's oldest and most successful designs. Survive, replicate, proliferate. That's been the cockroach model for geological ages, and it's ours too."

Features of Managed CockroachDB

Always-on service: Managed CockroachDB automatically replicates data across three availability zones for single-region deployments, making it an always-on service for critical applications. As a globally scalable distributed SQL database, CockroachDB also supports geo-partitioned clusters at whatever scale the business demands.
Cockroach Labs manages the hardware provisioning, setup, and configuration for managed clusters so that they are optimized for performance.
Since CockroachDB is cloud agnostic, one can migrate from one cloud service provider to another at peak load with zero downtime.
Automatic upgrades to the latest releases and hourly incremental backups of the data make operations easier.
The Cockroach Labs team provides 24x7 monitoring and enterprise-grade security for all customers.

CockroachDB provides the capabilities for building ultra-resilient, high-scale, global applications. It features distributed SQL with ACID (Atomicity, Consistency, Isolation, Durability) transactions. Features like cluster visualization, priority support, native JSON support, and automated scaling make it even more unique. A short connection sketch follows at the end of this article.

Read more about this announcement on the Cockroach Labs official website.

SQLite adopts the rule of St. Benedict as its Code of Conduct, drops it to adopt Mozilla's community participation guidelines, in a week
MariaDB acquires Clustrix to give database customers 'freedom from Oracle lock-in'
Why Neo4j is the most popular graph database
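Because CockroachDB speaks the PostgreSQL wire protocol, a standard Postgres driver is enough to talk to a cluster, managed or self-hosted. Here is a minimal Python sketch; the port, credentials, and table are insecure local-development defaults and assumptions for illustration, not production settings.

```python
# Minimal sketch: connect to a CockroachDB node with a PostgreSQL driver.
# 26257 is CockroachDB's default SQL port; user/db reflect an insecure
# local cluster and are assumptions, not production settings.
import psycopg2

conn = psycopg2.connect(host="localhost", port=26257,
                        user="root", dbname="defaultdb")
conn.autocommit = True
cur = conn.cursor()

cur.execute("CREATE TABLE IF NOT EXISTS accounts (id INT PRIMARY KEY, balance DECIMAL)")
# UPSERT is CockroachDB's insert-or-update statement.
cur.execute("UPSERT INTO accounts (id, balance) VALUES (1, 100.50), (2, 250.00)")
cur.execute("SELECT id, balance FROM accounts ORDER BY id")
for row in cur.fetchall():
    print(row)

cur.close()
conn.close()
```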


Facebook open sources the ELF OpenGo project and retrains the model using reinforcement learning

Sugandha Lahoti
20 Feb 2019
3 min read
Facebook has open sourced its ELF OpenGo project and added new features to it. ELF OpenGo is a reimplementation of AlphaGo Zero / AlphaZero, first released last May to allow AI researchers to better understand how AI systems learn. The open source bot had a 20-0 record against top professional Go players and has been widely adopted by the AI research community to run their own Go experiments.

Now, the Facebook AI Research team has announced new features and research results related to ELF OpenGo. They have retrained the ELF OpenGo model using reinforcement learning and have also released a Windows executable version of the bot, which can be used as a training aid for Go players. A unique archive showing ELF OpenGo's analysis of 87,000 professional Go games has also been released, which will help Go players assess their performance in detail, along with the dataset of 20 million self-play games and the 1,500 intermediate models. Facebook researchers have shared their experiments and learnings from retraining the ELF OpenGo model in a new research paper, which details extensive experiments that modify individual features during evaluation to better understand the properties of these kinds of algorithms.

Training ELF OpenGo

ELF OpenGo was trained on 2,000 GPUs for 9 days. After that, the 20-block model was comparable to the 20-block models described in AlphaGo Zero and AlphaZero. The release also provides pretrained superhuman models, the code used to train them, a comprehensive training trajectory dataset featuring 20 million self-play games and over 1.5 million training minibatches, and auxiliary data.

Model behavior during training

There is high variance in the model's strength compared to other models, and this property holds even if the learning rates are reduced.
Moves that require significant lookahead to determine whether they should be played, such as "ladder" moves, are learned slowly by the model and are never fully mastered.
The model quickly learns high-quality moves at different stages of the game. In contrast to the typical behavior of tabular RL, the rate of progression for learning both mid-game and end-game moves is nearly identical.

In a Facebook blog post, the team behind this RL model wrote: "We're excited that our development of this versatile platform is helping researchers better understand AI, and we're gratified to see players in the Go community use it to hone their skills and study the game. We're also excited to expand last year's release into a broader suite of open source resources."

The research paper, titled "ELF OpenGo: An Analysis and Open Reimplementation of AlphaZero", is available on arXiv.

Google DeepMind's AI AlphaStar beats StarCraft II pros TLO and MaNa; wins 10-1 against the gamers
Deepmind's AlphaZero shows unprecedented growth in AI, masters 3 different games
FAIR releases a new ELF OpenGo bot with a unique archive that can analyze 87k professional Go games